Mediation and Spillover Effects in Group-Randomized
Trials: A Case Study of the 4Rs Educational Intervention


Tyler J. VANDERWEELE, Guanglei HONG, Stephanie M. JONES, and Joshua L. BROWN 


Peer influence and social interactions can give rise to spillover effects in which the exposure of one individual may affect outcomes of other 
individuals. Even if the intervention under study occurs at the group or cluster level as in group-randomized trials, spillover effects can 
occur when the mediator of interest is measured at a lower level than the treatment. Evaluators who choose groups rather than individuals as 
experimental units in a randomized trial often anticipate that the desirable changes in targeted social behaviors will be reinforced through 
interference among individuals in a group exposed to the same treatment. In an empirical evaluation of the effect of a school-wide intervention 
on reducing individual students’ depressive symptoms, schools in matched pairs were randomly assigned to the 4Rs intervention or the 
control condition. Class quality was hypothesized as an important mediator assessed at the classroom level. We reason that the quality 
of one classroom may affect outcomes of children in another classroom because children interact not simply with their classmates but 
also with those from other classes in the hallways or on the playground. In investigating the role of class quality as a mediator, failure 
to account for such spillover effects of one classroom on the outcomes of children in other classrooms can potentially result in bias and 
problems with interpretation. Using a counterfactual conceptualization of direct, indirect, and spillover effects, we provide a framework that 
can accommodate issues of mediation and spillover effects in group-randomized trials. We show that the total effect can be decomposed
into a natural direct effect, a within-classroom mediated effect, and a spillover mediated effect. We give identification conditions for each 
of the causal effects of interest and provide results on the consequences of ignoring “interference” or “spillover effects” when they are 
in fact present. Our modeling approach disentangles these effects. The analysis examines whether the 4Rs intervention has an effect on 
children’s depressive symptoms through changing the quality of other classes as well as through changing the quality of a child’s own class.


Supplementary materials for this article are available online. 


KEY WORDS: Direct/indirect effects; Interference; Multilevel models; Social interactions.


1. INTRODUCTION 


Can schools do a better job improving children’s social- 
emotional well-being? The Reading, Writing, Respect, and Res- 
olution (4Rs) program is aimed at promoting not only liter- 
acy development but also intergroup understanding and conflict 
resolution. This school-wide intervention program has three 
components: (i) a literacy-based curriculum in conflict reso- 
lution and social-emotional learning, (ii) training and ongoing 
coaching of teachers in the delivery of the 4Rs curriculum, and 
(iii) a family-based parent-child homework arrangement. The 
intervention was designed for an entire school on the basis of 
the theory that there would be reinforcement among teachers and 
students across different classrooms in the same building. In par- 
ticular, students’ social-emotional development was expected to 
improve by benefitting not only from what would potentially be
an enhanced environment within their own classroom but also 
from an enhanced environment in the school as a whole. This is 
because students interact not only within a classroom but also 
in the hallways, in the cafeteria, and on the playground with 
students from other classrooms. Therefore, the improvement of 
the quality of one classroom might affect the social-emotional 
outcomes of children enrolled in other classrooms in the same 
school. 

This study investigates the mediating role of changes in class 
quality induced by the 4Rs intervention in influencing child 
depressive symptoms. Class quality encompasses instructional 
support, emotional support, and organizational climate. Specifically,
we evaluate the extent to which the effect of the 4Rs
intervention on children is (i) mediated by a child’s own classroom
quality, (ii) mediated by the quality of classes other than the
child’s own, and (iii) potentially transmitted through pathways other than
classroom quality. We analyze data from a group-randomized
experiment conducted in New York City in which elementary 
schools in matched pairs were assigned at random to either the 
4Rs program or the control condition (Brown et al. 2010; Jones, 
Brown, and Aber 2011). 

A conventional approach to mediation analysis consists of 
regressing the outcome on the treatment with and without the 
mediator variable. This approach takes the coefficient for the 
treatment in the model without the mediator as a “total effect;” 
it takes the coefficient for the treatment in the model with the 
mediator as a “direct effect.” The difference between the “total
effect” and the “direct effect” is a measure of the “indirect
effect.” This approach to mediation is sometimes referred to
as the “difference method.” When mediator and outcome are
both continuous, this approach gives the same estimate as the
“product-of-coefficients method,” which takes as the “indirect
effect” the product of the coefficient of the treatment in a
model for the mediator and the coefficient of the mediator in a
model for the outcome. This product-of-coefficients method is
what is typically employed in mediation analyses with structural
equation models. Although the product and difference methods
give the same estimates with linear models, similar results do
not hold with dichotomous outcomes or in nonlinear models
(MacKinnon 2008).
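
The equivalence of the two methods in linear models can be checked numerically. The following is a minimal sketch on simulated data; the sample size, coefficients, and variable names are illustrative assumptions and are not taken from the 4Rs study. It fits the three ordinary least-squares regressions described above and compares the difference estimate with the product estimate.

```python
import numpy as np

# Simulated data; all coefficients below are hypothetical.
rng = np.random.default_rng(0)
n = 500
t = rng.integers(0, 2, size=n).astype(float)   # binary treatment
m = 0.8 * t + rng.normal(size=n)               # continuous mediator
y = 0.5 * t + 1.2 * m + rng.normal(size=n)     # continuous outcome


def ols(columns, response):
    """Least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(response))] + columns)
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


total = ols([t], y)[1]              # outcome on treatment only
direct, b_m = ols([t, m], y)[1:3]   # outcome on treatment and mediator
a_t = ols([t], m)[1]                # mediator on treatment

difference_estimate = total - direct   # "difference method"
product_estimate = a_t * b_m           # "product-of-coefficients method"
print(difference_estimate, product_estimate)
```

With least-squares fits on the same sample the two quantities agree up to floating-point error; the agreement is an algebraic property of linear regression and does not carry over to, for example, logistic models.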

The conventional approach is subject to several limitations 
when applied to the current study. First, the standard regres- 
sion approach typically ignores selection into mediator levels. 
Although the treatment was randomized, teachers and students 
within a school were not randomly assigned to alternative levels 
of class quality. Analyses ignoring this selection issue are sub- 
ject to potentially severe biases due to confounding (Judd and 
Kenny 1981; Robins and Greenland 1992; Pearl 2001). While 
the problem of selection into mediator levels is endemic to all 
mediational studies, the current analysis conditions on covariates
to make the “no-selection” assumption more plausible. The
second issue with the standard regression approach is that poten- 
tial interactions between the effects of treatment and mediator 
on the outcome are typically ignored. The assumption of no 
treatment-mediator interaction would be violated if, for exam- 
ple, class quality had a larger effect on child outcomes in 4Rs 
schools than in control schools. Recent literature on causal in- 
ference has made clear that mediation analysis becomes consid- 
erably more complex when such interactions are present (Pearl 
2001; Valeri and VanderWeele 2013). 

Perhaps even more importantly, the literature on mediation 
has ignored spillover effects in organizational settings. The is- 
sue is referred to as one of “interference between units” in the 
statistics literature (Cox 1958). No interference between units 
is a component of Rubin’s Stable Unit Treatment Value As- 
sumption or SUTVA (Rubin 1980, 1986). The assumption will 
be violated in settings in which social interactions allow one 
individual’s exposure or treatment assignment to affect the out- 
comes of other individuals. The assumption will also be violated 
in mediation studies if the mediator value displayed by one unit 
affects the outcomes of other units even if these units have been 
assigned to the same treatment. Such interference is part of 
the rationale of the 4Rs program. In theory, a child’s social- 
emotional outcomes may depend not only on the class quality 
experienced in the child’s own classroom but also on the quality 
of other classrooms at the same school due to social interactions 
among children from different classrooms. The school-wide in- 
tervention was designed to bring together educators’ collective 
efforts within a school so as to maximize positive social in- 
teractions and minimize negative interactions among children. 
Conceptualizing and analyzing the mediation effects are con- 
siderably more complex in the face of such interference. 

The literature on mediation in a multilevel setting (Vander- 
Weele 2010a) and on causal inference under interference (Hong 
and Raudenbush 2006; Sobel 2006; Rosenbaum 2007; Hudgens 
and Halloran 2008; Tchetgen Tchetgen and VanderWeele 2012) 
has not explicitly considered how to deal with the assessment of
spillover effects in mediation analysis. To address the question 
of spillover effects in the 4Rs intervention, we will extend earlier 
work. Specifically, our work extends beyond the results of Van- 
derWeele (2010a) by (i) giving counterfactual definitions for 
within-classroom and spillover mediated effects and showing 
that not only does the total effect of the intervention decompose 
into a direct effect and an indirect effect but also that the in- 
direct effect itself decomposes into an effect mediated through 
the quality of a child’s own classroom and a spillover effect 
mediated by the quality of the other classrooms at a school; (ii) 
giving identification results for within-classroom and spillover 
mediated effects; (iii) giving two results on the consequences of
ignoring interference when it is present; (iv) mapping the con- 
ceptual framework to a class of models that disentangle these 
effects; and (v) applying the methodology to the 4Rs intervention
study.

This case study itself will provide a basic template for the nu- 
merous group-randomized trials and quasi-experimental studies 
in which questions of mediation are of scientific interest and in- 
terference is likely present. Interference among units is common 
in social settings. For example, banning smoking in office build- 
ings could affect individual health not only through changing 
one’s own smoking behavior but also through the change in 
behavior of one’s colleagues. We hope that the approach de- 
scribed in the present article will offer a new framework and 
concrete guidance regarding how to investigate causal mecha- 
nisms that involve interference among units. This template will 
be applicable to settings in which the treatment is assigned at 
the organization or community level while the mediator and the 
outcome are at lower levels than the treatment. The article not 
only provides concepts and results for new analyses but also 
highlights the strong assumptions required to identify causal 
quantities that may be of interest. Such strong assumptions are 
often implicitly made but not acknowledged in the conventional 
analysis. By drawing attention to these strong assumptions and 
evaluating them in the context of the 4Rs application study, we 
hope that subsequent studies can be designed and data can be 
collected that may render these assumptions more plausible so 
that investigators can estimate the mediated effects of substan- 
tive interest. 

The article is organized as follows: Section 2 introduces no- 
tation and definitions of the causal effects. Section 3 presents 
the assumptions required for identifying direct, indirect, and 
spillover effects. Section 4 describes the analytical procedure 
and gives the estimation results. Section 5 concludes by dis- 
cussing the strengths and limitations of the study. 


2. NOTATION, DEFINITIONS, AND FRAMEWORK 


In the 4Rs study, the randomized treatment occurred at the 
school level; the mediator was measured at the classroom level 
and the outcome at the child level. We adapt the notation and 
multilevel mediation framework of VanderWeele (2010a) to ac- 
commodate this setting; and we apply and extend the work on 
interference of Hong and Raudenbush (2006) and Hudgens and 
Halloran (2008) to consider the spillover effects of interest. Let 
$T_k$ denote the school-wide randomized treatment (1 for the 4Rs
intervention; 0 for control) for school $k$. Let $M_{jk}$ denote the
classroom-level mediator for classroom $j$ in school $k$ indicating
classroom quality. Let $J_k$ denote the number of classrooms in
school $k$. Let $Y_{ijk}$ denote the child-level outcome for child $i$ in
classroom $j$ and school $k$ indicating child depressive symptoms.


2.1 Total Effects 


To define the causal effects of interest we proceed by introduc- 
ing potential outcome variables corresponding to values of the 
outcome under possibly contrary to fact treatment and mediator 
scenarios and also values of the mediator under possibly contrary
to fact treatment scenarios. In particular, let $Y_{ijk}(t_k)$ denote
the potential or counterfactual outcome that child $i$ in classroom
$j$ and school $k$ would have obtained if the school-level treatment,
$T_k$, were set to $t_k$. Similarly, let $M_{jk}(t_k)$ denote the potential or
counterfactual mediator that classroom $j$ in school $k$ would have
obtained if the school-level treatment, $T_k$, were set to $t_k$. We
assume that children do not change schools as the result of 
assigning a particular school to a certain treatment. Hong and 
Raudenbush (2006) referred to this assumption as that of “intact 
clusters.” We also assume that there is no interference between 
schools, that is, the treatment received at one school does not 
affect the outcomes of the children at any other school. Hence, 
our discussion considers only interference within schools, not 
interference between schools. 

With this notation, we can define the effect of the treatment on 
the outcome, $E[Y_{ijk}(1) - Y_{ijk}(0)]$, and the effect of the treatment
on the mediator, $E[M_{jk}(1) - M_{jk}(0)]$. These effects correspond
to the difference in average child outcomes and that in classroom
quality scores had all schools received the 4Rs intervention as
compared with if none of the schools had received the intervention.
Because the treatment is randomized, these causal effects
can be estimated using a simple intent-to-treat analysis. In this 
article, we are not, however, interested simply in the overall 
effect of the treatment on classroom quality and on child social- 
emotional outcomes but rather also in the extent to which the 
effect of the 4Rs intervention on child outcomes is mediated 
by classroom quality. To define the direct and indirect effects 
of interest, we will need to introduce additional counterfactual 
quantities below. 
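
As a rough illustration of an intent-to-treat contrast in a matched-pair group-randomized design, the sketch below averages within-pair differences of school-level mean outcomes. The function name and the numbers are hypothetical, and the choice to weight schools equally and to use a paired standard error is an assumption made for illustration, not a description of the analysis actually used for the 4Rs data.

```python
import numpy as np


def itt_matched_pairs(treated_school_means, control_school_means):
    """Mean within-pair difference in school-level means, with a paired standard error.

    Position p in each input holds the mean outcome of the treated and the
    control school of matched pair p.
    """
    treated = np.asarray(treated_school_means, dtype=float)
    control = np.asarray(control_school_means, dtype=float)
    if treated.shape != control.shape or treated.size < 2:
        raise ValueError("need two equal-length inputs with at least two pairs")
    diffs = treated - control
    estimate = diffs.mean()
    std_error = diffs.std(ddof=1) / np.sqrt(diffs.size)
    return estimate, std_error


# Hypothetical school-level mean depressive-symptom scores for three pairs.
estimate, std_error = itt_matched_pairs([1.0, 2.0, 3.0], [1.5, 2.5, 3.0])
print(estimate, std_error)
```

Because whole schools are randomized, the school (or the matched pair) rather than the child is the natural unit for computing uncertainty in this sketch.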


2.2 Potential Outcomes Incorporating Interference 


In studying mediation, we need to consider potential out- 
comes if both the treatment and the mediator had been set to 
values possibly other than what they in fact were. Because child 
outcomes may depend not only on the quality of the child’s own 
classroom but also on the quality of other classrooms, we need 
to incorporate such within-school interference into our potential 
outcomes notation. Let $Y_{ijk}(t_k, m_{jk}, \mathbf{m}_{-jk})$ denote the counterfactual
outcome that child $i$ in classroom $j$ and school $k$ would have
obtained if the school-level treatment in school $k$ were set to $t_k$,
if the quality in classroom $j$ of school $k$ were set to $m_{jk}$, and if the
quality of all classrooms in school $k$ other than classroom $j$ were
set to the vector $\mathbf{m}_{-jk} = (m_{1k}, \ldots, m_{j-1,k}, m_{j+1,k}, \ldots, m_{J_k k})$. As
we discussed in the Introduction, the theory underlying the 4Rs
intervention suggests that such interference would indeed be
present. Note that here, for a particular school, either the entire
school is treated ($T_k = 1$) or the entire school is untreated
($T_k = 0$), whereas in Hudgens and Halloran (2008) the treatment
is at the individual level so different clusters may have a
proportion treated anywhere between 0 and 1.

Following Hong and Raudenbush (2006) and Hudgens
and Halloran (2008), we assume that the potential outcome
$Y_{ijk}(t_k, m_{jk}, \mathbf{m}_{-jk})$ depends on $\mathbf{m}_{-jk}$ through some scalar function
$G(\mathbf{m}_{-jk})$ of $\mathbf{m}_{-jk}$ so that we may express the potential
outcome $Y_{ijk}(t_k, m_{jk}, \mathbf{m}_{-jk})$ as $Y_{ijk}(t_k, m_{jk}, G(\mathbf{m}_{-jk}))$. For example,
$G(\mathbf{m}_{-jk})$ may denote the average classroom quality for all
classrooms in school $k$ other than classroom $j$. In some social-emotional
outcome settings, it may be more reasonable for the
function $G(\mathbf{m}_{-jk})$ to be the geometric mean of classroom quality
for all classrooms in school $k$ other than classroom $j$ so
that if there is greater dispersion in classroom quality, the mean
will be less than would be the case if the quality of the various
classrooms were uniformly consistent. The assumption that
$G(\mathbf{m}_{-jk})$ is a scalar function of $\mathbf{m}_{-jk}$ will not in fact be necessary
for any of our identification results below but simplifies modeling
considerably. In practice, it is unlikely that there would
be sufficient data to avoid assumptions on $G(\mathbf{m}_{-jk})$ in modeling;
but our identification results themselves will hold even if
$G(\mathbf{m}_{-jk})$ is the entire vector $\mathbf{m}_{-jk}$. Here we let $Y_{ijk}(t, m, g)$ denote
the outcome for child $i$ in classroom $j$ and school $k$ if the
school received treatment $t$, the child’s classroom was at the
quality level $m$, and the scalar function of the quality of other
classrooms, $G(\mathbf{m}_{-jk})$, took the value $g$. In our discussion of mediation
below we will also use the notation $\mathbf{M}_{-jk}(t_k)$ to denote
$(M_{1k}(t_k), \ldots, M_{j-1,k}(t_k), M_{j+1,k}(t_k), \ldots, M_{J_k k}(t_k))$.
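
The two candidate choices of the summary function mentioned above can be sketched in a few lines. The helper below, whose name and interface are assumptions made for illustration, returns for every classroom in one school either the arithmetic or the geometric mean of the quality scores of the remaining classrooms; the scores are hypothetical and must be positive for the geometric mean.

```python
import numpy as np


def others_summary(quality, kind="mean"):
    """For each classroom j, summarize the quality of all other classrooms.

    quality: 1-D sequence of classroom quality scores for one school.
    kind: "mean" for the arithmetic mean, "geometric" for the geometric mean.
    """
    quality = np.asarray(quality, dtype=float)
    n = quality.size
    if n < 2:
        raise ValueError("need at least two classrooms in the school")
    if kind == "mean":
        # Leave-one-out mean: remove classroom j from the school total.
        return (quality.sum() - quality) / (n - 1)
    if kind == "geometric":
        if np.any(quality <= 0):
            raise ValueError("geometric mean requires positive scores")
        logs = np.log(quality)
        return np.exp((logs.sum() - logs) / (n - 1))
    raise ValueError(f"unknown kind: {kind}")


# Same arithmetic mean among the other classrooms, different dispersion.
even = others_summary([4.0, 4.0, 4.0, 4.0], kind="geometric")
spread = others_summary([4.0, 2.0, 6.0, 4.0], kind="geometric")
print(even[0], spread[0])
```

For the first classroom, the other classrooms have the same arithmetic mean in both schools, but the geometric mean is lower in the school with more dispersed quality, which is the property described in the text.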


2.3 Controlled Direct Effects 


With this notation in place, we can draw upon the literature 
on causal inference for direct and indirect effects (Robins and 
Greenland 1992; Pearl 2001) and on mediation in a multilevel 
context (VanderWeele 2010a) to define the direct and indirect 
effects of interest in a group-randomized trial. Note that we 
essentially have two mediators of interest here, the quality of a 
child’s own classroom and the quality of other classrooms at the 
school. The causal contrast 


$$E[Y_{ijk}(1, m, g) - Y_{ijk}(0, m, g)]$$


provides a measure of the direct effect of the 4Rs program
when also intervening to fix the classroom quality of the child’s
own classroom to level $m$ and intervening to fix the average
quality of other classrooms to $g$. This quantity is referred to as a
controlled direct effect of treatment, intervening to fix the values
of the mediators to a particular level irrespective of treatment
$T_k$. The contrast at different levels of $m$ and $g$ could also be
used to assess whether the effect of treatment on the outcome 
varies with a child’s own classroom quality or the quality of 
other classrooms. Likewise the contrast 


$$E[Y_{ijk}(t, m, g) - Y_{ijk}(t, m^*, g)]$$


could be used to assess the effect of a child’s own classroom 
quality (comparing levels m and m*) on a child’s outcome. We 
may examine whether the effect of classroom quality depends 
on the presence of the 4Rs intervention or whether it depends 
on the quality of classrooms other than the child’s own (i.e.,
whether the contrast varies with $t$ or $g$). Similarly, the contrast


$$E[Y_{ijk}(t, m, g) - Y_{ijk}(t, m, g^*)]$$


could be used to assess the spillover effect on a child’s outcome
of the quality of classrooms other than the child’s own. We may
examine whether this spillover effect depends on the presence
of the 4Rs intervention or whether it depends on the child’s own
classroom quality (i.e., whether the contrast varies with $t$ or
$m$). For example, it may turn out that a child’s social-emotional
outcome is relatively unaffected by such spillover effects when 
a child’s own classroom quality is high but that behavior is more 
susceptible to spillover effects when a child’s classroom quality 
is low. 


2.4 Natural Direct and Indirect Effects 


An alternative conceptualization of a direct effect is what
is sometimes referred to as a “natural direct effect” (Pearl
2001; Robins and Greenland 1992), which provides a measure
of the direct effect of treatment when the classroom
quality mediators are set to the levels they would
have been at under the control condition. The natural direct
effect in this setting is defined as
$E[Y_{ijk}(1, M_{jk}(0), G(\mathbf{M}_{-jk}(0))) - Y_{ijk}(0, M_{jk}(0), G(\mathbf{M}_{-jk}(0)))]$.
We can also define a natural indirect effect as
$E[Y_{ijk}(1, M_{jk}(1), G(\mathbf{M}_{-jk}(1))) - Y_{ijk}(1, M_{jk}(0), G(\mathbf{M}_{-jk}(0)))]$.
As in the case of nonclustered
treatments without interference (Pearl 2001), the total effect
of the intervention on the outcome, $E[Y_{ijk}(1) - Y_{ijk}(0)]$, decomposes
into a natural direct and indirect effect:


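$$
\begin{aligned}
E[Y_{ijk}(1) - Y_{ijk}(0)] &= E[Y_{ijk}(1, M_{jk}(1), G(\mathbf{M}_{-jk}(1))) - Y_{ijk}(0, M_{jk}(0), G(\mathbf{M}_{-jk}(0)))] \\
&= E[Y_{ijk}(1, M_{jk}(1), G(\mathbf{M}_{-jk}(1))) - Y_{ijk}(1, M_{jk}(0), G(\mathbf{M}_{-jk}(0)))] \\
&\quad + E[Y_{ijk}(1, M_{jk}(0), G(\mathbf{M}_{-jk}(0))) - Y_{ijk}(0, M_{jk}(0), G(\mathbf{M}_{-jk}(0)))]
\end{aligned}
$$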


where the first expression in the sum is the natural indirect effect 
and the second expression is the natural direct effect. Note that 
the decomposition is achieved simply by adding and subtracting 
the term $E[Y_{ijk}(1, M_{jk}(0), G(\mathbf{M}_{-jk}(0)))]$.

The decomposition will hold irrespective of the functional 
form of $Y_{ijk}(t, m, g)$. In particular, the decomposition will hold
even if there are interactions between the effects of the treatment
and the mediator on the outcome, that is, if the controlled direct
effect $E[Y_{ijk}(1, m, g) - Y_{ijk}(0, m, g)]$ is not constant across $m$
and $g$. Robins and Greenland (1992) referred to the decomposition
above as one of a total effect into a pure direct effect and a
total indirect effect; a decomposition is also possible into a total 
direct effect and a pure indirect effect (Robins and Greenland 
1992; Robins 2003): 


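$$
\begin{aligned}
E[Y_{ijk}(1) - Y_{ijk}(0)] &= E[Y_{ijk}(1, M_{jk}(1), G(\mathbf{M}_{-jk}(1))) - Y_{ijk}(0, M_{jk}(1), G(\mathbf{M}_{-jk}(1)))] \\
&\quad + E[Y_{ijk}(0, M_{jk}(1), G(\mathbf{M}_{-jk}(1))) - Y_{ijk}(0, M_{jk}(0), G(\mathbf{M}_{-jk}(0)))];
\end{aligned}
$$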


in other words, the decomposition is not unique. The pure direct 
effect need not equal the total direct effect and similarly the pure 
indirect effect need not equal the total indirect effect unless the 
effect of classroom quality on the outcome does not depend on 
treatment. 


2.5 Mediated Spillover Effect and Within-Class
Mediated Effect


In fact, the natural indirect effect itself decomposes into an 
effect mediated through the quality of a child’s own classroom 
and a spillover effect through the quality of other classrooms at 
a school. Consider the contrast 


$$E[Y_{ijk}(1, M_{jk}(1), G(\mathbf{M}_{-jk}(1))) - Y_{ijk}(1, M_{jk}(1), G(\mathbf{M}_{-jk}(0)))].$$


This quantity considers child outcomes under treatment (with
the quality of a child’s own classroom set to what it would be
under treatment) and contrasts what would happen if the
quality of other classrooms were set to the level it would
be with treatment versus the level it would be without treatment. The
contrast is one way to formally capture the effect of the treatment
on a child’s outcome as mediated through the quality of classrooms
other than the one the child is in. We will refer to this
contrast as the spillover mediated effect. Similarly, we could
also consider the contrast
$E[Y_{ijk}(1, M_{jk}(1), G(\mathbf{M}_{-jk}(0))) - Y_{ijk}(1, M_{jk}(0), G(\mathbf{M}_{-jk}(0)))]$;
this quantity considers child outcomes
under treatment, with the classroom quality in other classrooms
set to what it would have been without treatment, and
contrasts what would happen if the quality of a child’s
own classroom were set to the level it would be with
treatment versus the level it would be without treatment. The contrast is
one way to conceive of the effect of the treatment on a child’s
outcome as mediated through the quality of the child’s own
classroom. We will refer to this contrast as the within-classroom
mediated effect. Having defined these effects, we can now decompose
the natural indirect effect into the spillover mediated
effect and the within-classroom mediated effect:


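$$
\begin{aligned}
E[Y_{ijk}(1, M_{jk}(1), G(\mathbf{M}_{-jk}(1))) &- Y_{ijk}(1, M_{jk}(0), G(\mathbf{M}_{-jk}(0)))] \\
&= E[Y_{ijk}(1, M_{jk}(1), G(\mathbf{M}_{-jk}(1))) - Y_{ijk}(1, M_{jk}(1), G(\mathbf{M}_{-jk}(0)))] \\
&\quad + E[Y_{ijk}(1, M_{jk}(1), G(\mathbf{M}_{-jk}(0))) - Y_{ijk}(1, M_{jk}(0), G(\mathbf{M}_{-jk}(0)))].
\end{aligned}
$$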


As with the decomposition of a total effect into a natural direct 
and indirect effect, the decomposition of a natural indirect effect 
into a mediated spillover effect and a within-classroom mediated 
effect is not unique. 

From our discussion above, it follows that we can decompose 
a total causal effect into the sum of (i) a spillover mediated effect, 
(ii) a within-classroom mediated effect, and (iii) a natural direct 
effect as follows: 


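$$
\begin{aligned}
E[Y_{ijk}(1) - Y_{ijk}(0)] &= E[Y_{ijk}(1, M_{jk}(1), G(\mathbf{M}_{-jk}(1))) - Y_{ijk}(1, M_{jk}(1), G(\mathbf{M}_{-jk}(0)))] \\
&\quad + E[Y_{ijk}(1, M_{jk}(1), G(\mathbf{M}_{-jk}(0))) - Y_{ijk}(1, M_{jk}(0), G(\mathbf{M}_{-jk}(0)))] \\
&\quad + E[Y_{ijk}(1, M_{jk}(0), G(\mathbf{M}_{-jk}(0))) - Y_{ijk}(0, M_{jk}(0), G(\mathbf{M}_{-jk}(0)))].
\end{aligned}
$$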
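
The decompositions in this section can be verified numerically for any assumed structural model. The sketch below uses a hypothetical potential-outcome function with a treatment-by-mediator interaction and hypothetical potential classroom-quality values for a single school; none of the numbers come from the 4Rs data, the summary function is taken to be the mean quality of the other classrooms, and averaging over classrooms stands in for the expectation.

```python
import numpy as np


def outcome(t, m, g):
    """Hypothetical potential outcome Y(t, m, g) with a t-by-m interaction."""
    return 2.0 - 0.3 * t - 0.5 * m - 0.2 * g - 0.1 * t * m


def others_mean(quality):
    """Assumed G: mean quality of the classrooms other than each classroom j."""
    return (quality.sum() - quality) / (quality.size - 1)


# Hypothetical potential classroom quality under control and under treatment.
m0 = np.array([3.0, 3.5, 4.0, 4.5])
m1 = m0 + np.array([0.6, 0.2, 0.4, 0.0])
g0, g1 = others_mean(m0), others_mean(m1)

# Three-way decomposition of the total effect.
spillover = np.mean(outcome(1, m1, g1) - outcome(1, m1, g0))
within = np.mean(outcome(1, m1, g0) - outcome(1, m0, g0))
natural_direct = np.mean(outcome(1, m0, g0) - outcome(0, m0, g0))
total = np.mean(outcome(1, m1, g1) - outcome(0, m0, g0))

# Alternative two-way decomposition (total direct plus pure indirect effect).
total_direct = np.mean(outcome(1, m1, g1) - outcome(0, m1, g1))
pure_indirect = np.mean(outcome(0, m1, g1) - outcome(0, m0, g0))

print(spillover, within, natural_direct, total)
print(total_direct, pure_indirect)
```

Both decompositions sum to the same total effect, while the natural (pure) direct effect and the total direct effect differ because of the interaction term, illustrating that the decomposition is not unique.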


In the next section, we consider the identification of these vari- 
ous direct, mediated, and spillover effects. 


3. IDENTIFICATION OF DIRECT, INDIRECT
AND SPILLOVER EFFECTS


In this section, we extend the identification results of 
VanderWeele (2010a) on mediation to allow for the spillover 
effects and within-classroom mediated effects, described above, 
which arise because of interference. We also give results on the 
consequences of ignoring interference when it is in fact present. 
Let $X_{ijk}$ denote child-level baseline covariates, $W_{jk}$ denote
classroom-level baseline covariates, and $V_k$ denote school-level
baseline covariates. Let $\mathbf{X}_{-ijk}$ denote the vector of child-level
baseline covariates for children in school $k$ other than child
$i$ in classroom $j$, and let $\mathbf{W}_{-jk}$ denote classroom-level baseline
covariates for classrooms in school $k$ other than classroom $j$. Let
$\mathbf{M}_k = (M_{1k}, \ldots, M_{J_k k})$ and $\mathbf{M}_k(t) = (M_{1k}(t), \ldots, M_{J_k k}(t))$.
For sets of random variables $A$, $B$, and $C$, we will use
$A \perp\!\!\!\perp B \mid C$ to denote that $A$ is independent of $B$ conditional
on $C$. We will consider certain functions of the baseline
covariates of other children in the classroom (or even at the
school), $h_1(\mathbf{X}_{-ijk})$, and of baseline covariates of classrooms
other than a child’s own, $h_2(\mathbf{W}_{-jk})$. In practice, $h_1(\mathbf{X}_{-ijk})$
will likely be fairly constant within a classroom which may
have 20 or more children (since only one child is different
in the computation of each of these means or functions) and
thus may plausibly be replaced by the aggregate summary of
$X_{ijk}$ for all children in classroom $j$ and school $k$. To simplify
notation, we let $C_{ijk} = (X_{ijk}, W_{jk}, V_k, h_1(\mathbf{X}_{-ijk}), h_2(\mathbf{W}_{-jk}))$.
We present five identification results, one for controlled direct 
effects, one for natural direct and indirect effects, one for 
the spillover mediated effect and within-classroom mediated 
effect, and two for the consequences of ignoring interference 
when it is in fact present. Several of the counterfactual 
independence assumptions will follow immediately from 
the randomization of treatment; however, we will state the 
assumptions in such a way so as to allow the results to also 
be applicable to observational settings. As we move through the 
presentation of these results, we will see that each subsequent 
result requires stronger and stronger assumptions. In our actual 
analysis of the 4Rs data, we evaluate the plausibility of each of 
these assumptions and choose to rely on some of the less strong 
assumptions. 


3.1 Total and Controlled Direct Effects 
Under Interference 


Since treatment $T_k$ is randomized, it will be independent of
all pretreatment covariates and also of the potential outcomes
as these can be viewed as (unobserved) pretreatment covariates.
We thus have that $T_k \perp\!\!\!\perp \{Y_{ijk}(t, m, g), C_{ijk}\}$ and thus also that
for all $t$, $m$, $g$,


$$Y_{ijk}(t, m, g) \perp\!\!\!\perp T_k \mid C_{ijk}, \tag{1}$$


which is sufficient for identifying the total effect. We can 
now state our first result concerning controlled direct ef- 
fects. The result requires not only that treatment is random- 
ized so that (1) holds but also that selection into mediator 
levels of classroom quality is effectively random conditional 
on treatment, $T_k$, and the covariates
$C_{ijk} = (X_{ijk}, W_{jk}, V_k, h_1(\mathbf{X}_{-ijk}), h_2(\mathbf{W}_{-jk}))$.
Proofs of all theorems are given in the online supplement.


Theorem 1. If treatment $T_k$ is randomized then (1) will hold.
If, in addition, we have that for all $t$, $m$, $g$,


$$Y_{ijk}(t, m, g) \perp\!\!\!\perp \{M_{jk}, G(\mathbf{M}_{-jk})\} \mid T_k, C_{ijk} \tag{2}$$


then the average counterfactual outcome $Y_{ijk}(t, m, g)$ conditional
on $C_{ijk}$ is identified and is given by:


$$E[Y_{ijk}(t, m, g) \mid C_{ijk} = c] = E[Y_{ijk} \mid T_k = t, M_{jk} = m, G(\mathbf{M}_{-jk}) = g, C_{ijk} = c]. \tag{3}$$


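Under assumption (2), we could apply Theorem 1 twice (once
for $t = 1$ and once for $t = 0$) and sum (3) over $C_{ijk}$ to obtain the
controlled direct effect of treatment,


$$
\begin{aligned}
E[Y_{ijk}(1, m, g) - Y_{ijk}(0, m, g)]
&= \sum_c \{E[Y_{ijk} \mid T_k = 1, M_{jk} = m, G(\mathbf{M}_{-jk}) = g, C_{ijk} = c] \\
&\qquad - E[Y_{ijk} \mid T_k = 0, M_{jk} = m, G(\mathbf{M}_{-jk}) = g, C_{ijk} = c]\} \\
&\qquad \times P(C_{ijk} = c).
\end{aligned}
\tag{4}
$$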


Similarly, we could obtain expressions for the average effect 
on the outcome of a change in the quality of a child’s class- 
room, while holding the school-level intervention fixed and the 
classroom quality of the other classrooms fixed, 


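$$
\begin{aligned}
E[Y_{ijk}(t, m, g) - Y_{ijk}(t, m^*, g)]
&= \sum_c \{E[Y_{ijk} \mid T_k = t, M_{jk} = m, G(\mathbf{M}_{-jk}) = g, C_{ijk} = c] \\
&\qquad - E[Y_{ijk} \mid T_k = t, M_{jk} = m^*, G(\mathbf{M}_{-jk}) = g, C_{ijk} = c]\} \\
&\qquad \times P(C_{ijk} = c).
\end{aligned}
\tag{5}
$$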


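We could also obtain expressions for the average effect on the
outcome of a change in the measure of the quality of other classrooms
from level $G(\mathbf{M}_{-jk}) = g$ to level $G(\mathbf{M}_{-jk}) = g^*$ while
holding the school-level intervention fixed and the classroom
quality of a child’s own classroom fixed,


$$
\begin{aligned}
E[Y_{ijk}(t, m, g) - Y_{ijk}(t, m, g^*)]
&= \sum_c \{E[Y_{ijk} \mid T_k = t, M_{jk} = m, G(\mathbf{M}_{-jk}) = g, C_{ijk} = c] \\
&\qquad - E[Y_{ijk} \mid T_k = t, M_{jk} = m, G(\mathbf{M}_{-jk}) = g^*, C_{ijk} = c]\} \\
&\qquad \times P(C_{ijk} = c).
\end{aligned}
\tag{6}
$$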


The conclusion of Theorem 1 still applies if the independence
assumptions (1) and (2) hold only in mean rather than in distri- 
bution. 
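
To make the standardization in formula (4) concrete, the sketch below evaluates a plug-in version of it when the mediators and the covariate are discrete. The function name, the toy data, and the use of raw cell means are illustrative assumptions; a real analysis would typically replace the cell means with a fitted regression model and would account for clustering when computing uncertainty.

```python
import numpy as np


def controlled_direct_effect(y, t, m, g, c, m_level, g_level):
    """Plug-in standardization over a discrete covariate c.

    Within each stratum of c, take the treated-minus-control difference in
    mean outcome among units with m == m_level and g == g_level, then weight
    the differences by the marginal proportion P(C = c).
    Returns NaN when some stratum has no treated or no control units at the
    requested mediator levels, since the formula cannot then be evaluated.
    """
    y, t, m, g, c = map(np.asarray, (y, t, m, g, c))
    at_level = (m == m_level) & (g == g_level)
    effect = 0.0
    for value in np.unique(c):
        in_stratum = c == value
        treated = y[in_stratum & at_level & (t == 1)]
        control = y[in_stratum & at_level & (t == 0)]
        if treated.size == 0 or control.size == 0:
            return float("nan")
        effect += (treated.mean() - control.mean()) * in_stratum.mean()
    return effect


# Toy data (hypothetical): two covariate strata, all units at m = 1, g = 1.
t = np.array([0, 1, 0, 1, 0, 1, 0, 1])
c = np.array([0, 0, 0, 0, 1, 1, 1, 1])
m = np.ones(8)
g = np.ones(8)
y = np.array([1.0, 2.0, 1.0, 2.0, 3.0, 5.0, 3.0, 5.0])

cde = controlled_direct_effect(y, t, m, g, c, m_level=1, g_level=1)
cde_missing = controlled_direct_effect(y, t, m, g, c, m_level=2, g_level=1)
print(cde, cde_missing)
```

The second call requests a mediator level that never occurs in the toy data, so the function returns NaN rather than extrapolating; this is a simple stand-in for the overlap requirement implicit in the formula.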

We now turn to the interpretation of condition (2). The quality
of a classroom, $M_{jk}$, will in general depend on the particular
teacher in classroom $j$ and on which children are assigned to
classroom $j$. For a particular child $i$ in classroom $j$ and school $k$,
the potential outcomes $Y_{ijk}(t, m, g)$, as $t$, $m$, and $g$ vary, can be
thought of as how well a child would fare under different scenarios
concerning treatment, classroom quality, and the classroom
quality of other classrooms. Condition (2) is contingent
on the process by which children are organized into classrooms
and assigned to teachers which thereby gives rise to the quality
of the classroom. Condition (2) requires that, for a particular
child, conditional on that child’s baseline covariates, $X_{ijk}$, certain
characteristics of the school and teachers, $W_{jk}$, $V_k$, $h_2(\mathbf{W}_{-jk})$,
and some measure of the baseline covariates of other children,
$h_1(\mathbf{X}_{-ijk})$, this process of class assignment is independent
of how well the child would fare (i.e., of the potential outcomes
$Y_{ijk}(t, m, g)$). If the schools’ assignment mechanisms use
only information on $C_{ijk} = (X_{ijk}, W_{jk}, V_k, h_1(\mathbf{X}_{-ijk}), h_2(\mathbf{W}_{-jk}))$
to place children into classrooms, or if the information they use
beyond $C_{ijk}$ is unrelated to the potential outcomes, then this
may be a reasonable assumption. However, even if children are 
randomly assigned to classrooms, assumption (2) may yet not 
hold because a child’s own characteristics may have an effect on 
classroom quality; the value of the mediator is partially caused 
by the child’s membership in the class. This issue may be par- 
tially circumvented by stratification or control for covariates that 
relate to child characteristics that may affect classroom quality; 
the issue will also be partially mitigated in settings with larger 
class sizes. 

For the analysis described below to be valid, the covariates 
$C_{ijk}$, for which control is made, must not be affected by treatment.
If covariates required for assumption (2) to hold are affected
by treatment, then an alternative identification approach
for direct effects would be needed (Pearl 2001; VanderWeele 
2009). Note that assumption (2) is effectively also presupposed 
by the conventional mediation analysis, though sometimes left 
unstated; moreover with conventional mediation analyses, ef- 
fort is often not devoted to trying to control for covariates that 
may render assumption (2) more plausible. Finally, the conventional
analysis, in addition to making these assumptions, does not
accommodate treatment-mediator interactions.

Even under assumption (2), the conditional expectation
$E[Y_{ijk} \mid T_k = t, M_{jk} = m, G(M_{-jk}) = g, C_{ijk} = c]$ must be modeled.
This could be done by a multilevel regression model. Covariate
overlap for the variables in C across strata defined by
T, M, and G should be checked to avoid extrapolation of the 
regression model to regions where data are not available. We 
discuss modeling further below. 
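
As a minimal sketch of such an overlap check, assuming a single numeric covariate and hypothetical record fields (`t`, `m`, `g`, `x`), the following Python function computes the range of the covariate within each (T, M, G) stratum and the interval common to all strata. The data are invented for illustration, and comparing ranges is only a crude first look; it is not the procedure used in the paper.

```python
from collections import defaultdict


def common_support(records, covariate):
    """Per-stratum (min, max) of a covariate and the interval shared by all strata."""
    by_stratum = defaultdict(list)
    for r in records:
        by_stratum[(r["t"], r["m"], r["g"])].append(r[covariate])
    ranges = {s: (min(v), max(v)) for s, v in by_stratum.items()}
    lo = max(a for a, _ in ranges.values())
    hi = min(b for _, b in ranges.values())
    # If lo > hi, at least two strata have disjoint covariate ranges.
    shared = (lo, hi) if lo <= hi else None
    return ranges, shared


# Invented illustration data (not from the 4Rs study).
records = [
    {"t": 1, "m": 1, "g": 0, "x": 0.30}, {"t": 1, "m": 1, "g": 0, "x": 0.70},
    {"t": 0, "m": 1, "g": 0, "x": 0.40}, {"t": 0, "m": 1, "g": 0, "x": 0.90},
    {"t": 0, "m": 0, "g": 0, "x": 0.20}, {"t": 0, "m": 0, "g": 0, "x": 0.60},
]
ranges, shared = common_support(records, "x")
print(shared)
```

Regression predictions for covariate values outside the shared interval would rest on extrapolation for at least one stratum.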


3.2 Natural Direct and Indirect Effects 
Under Interference 


The expressions above for the various controlled direct effects
$E[Y_{ijk}(1, m, g) - Y_{ijk}(0, m, g)]$, $E[Y_{ijk}(t, m, g) - Y_{ijk}(t, m^*, g)]$,
and $E[Y_{ijk}(t, m, g) - Y_{ijk}(t, m, g^*)]$ can be useful in assessing
how important each individual factor (treatment, quality
of a student’s own classroom, and the quality of the other
classrooms) is in determining child-level outcomes while
holding the other factors fixed. However, to assess mediation,
it is also of interest to consider the extent to which the effect of
treatment, $T_k$, on child-level outcomes, $Y_{ijk}$, is mediated through
the other two factors, quality of a student’s own classroom, $M_{jk}$,
and quality of the other classrooms, $G(M_{-jk})$. For this, we need
to consider the effect of treatment on classroom quality in addition
to assessing the effect of treatment and classroom quality
on the outcome. Because treatment is randomized, it will be the
case that $T_k \perp\!\!\!\perp \{M_{jk}(t), G(M_{-jk}(t)), C_{ijk}\}$ and also that for all $t$,


$$(M_{jk}(t), G(M_{-jk}(t))) \perp\!\!\!\perp T_k \mid C_{ijk}. \tag{7}$$


This brings us to our second identification result which allows 
for the identification of the natural direct effect and the total 
natural indirect effect. In a third identification result below, we 
will consider the identification of spillover mediated effects 
and within-classroom mediated effects. 


Journal of the American Statistical Association, June 2013 


Theorem 2. If treatment $T_k$ is randomized then (1) and (7)
will hold. If, in addition, (2) holds and we furthermore have that
for all $t$, $t^*$, $m$, $g$,


$$Y_{ijk}(t, m, g) \perp\!\!\!\perp \{M_{jk}(t^*), G(M_{-jk}(t^*))\} \mid C_{ijk} \tag{8}$$


then the average counterfactual outcome $Y_{ijk}(t, M_{jk}(t^*), G(M_{-jk}(t^*)))$
conditional on $C_{ijk}$ is identified and is given by


$$E[Y_{ijk}(t, M_{jk}(t^*), G(M_{-jk}(t^*))) \mid C_{ijk} = c]
= \sum_m \sum_g E[Y_{ijk} \mid T_k = t, M_{jk} = m, G(M_{-jk}) = g, C_{ijk} = c]
\times P(M_{jk} = m, G(M_{-jk}) = g \mid T_k = t^*, C_{ijk} = c). \tag{9}$$


Assuming (2) holds, assumption (8) then essentially requires
that there is no post-treatment covariate that is an effect
of treatment $T_k$ that in turn affects both the mediator
levels of the various classrooms and also the child-level outcomes
(Pearl 2001). This may be plausible if changes to the
mediator occur not long after the treatment. (See also Pearl
(2001) and VanderWeele (2010a) for further discussion.) We
note that without assumption (8), there is no data to identify
$E[Y_{ijk}(t, M_{jk}(t^*), G(M_{-jk}(t^*))) \mid C_{ijk} = c]$; it is assumption (8)
that allows us to draw inferences about the counterfactual conditional
expectation $E[Y_{ijk}(t, M_{jk}(t^*), G(M_{-jk}(t^*))) \mid C_{ijk} = c]$
from the observed conditional expectation
$E[Y_{ijk} \mid T_k = t, M_{jk} = m, G(M_{-jk}) = g, C_{ijk} = c]$ and the distribution of the mediator.
Of course, even under assumption (8), sufficient data would have
to be available to model
$E[Y_{ijk} \mid T_k = t, M_{jk} = m, G(M_{-jk}) = g, C_{ijk} = c]$ so as to be able to extrapolate from the observed
data. As above, covariate overlap for the variables in C across
strata defined by T, M, and G should be checked to avoid extrapolation
of the regression model to regions where data are
not available.

The conclusion of Theorem 2 also holds in an observational 
setting under independence assumptions (1), (2), (7), and (8). 
Note that if $T_k$ is randomized then (8) implies (2). Note also that
the conclusion of Theorem 2 still applies if the independence 
assumptions (1), (2), and (8) hold only in mean rather than in 
distribution; independence assumption (7), however, must hold 
in distribution. 

We can apply Theorem 2 to obtain empirical expressions for 
the natural direct effect and the natural indirect effect. Under 
assumptions (2) and (8), the natural direct effect described in 
the previous section is given by 


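$$E[Y_{ijk}(1, M_{jk}(0), G(M_{-jk}(0))) - Y_{ijk}(0, M_{jk}(0), G(M_{-jk}(0)))]
= \sum_c \sum_m \sum_g \{E[Y_{ijk} \mid T_k = 1, M_{jk} = m, G(M_{-jk}) = g, C_{ijk} = c]
- E[Y_{ijk} \mid T_k = 0, M_{jk} = m, G(M_{-jk}) = g, C_{ijk} = c]\}
\times P(M_{jk} = m, G(M_{-jk}) = g \mid T_k = 0, C_{ijk} = c)
\times P(C_{ijk} = c) \tag{10}$$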


and the natural indirect effect described in the previous section 
is given by 


$$E[Y_{ijk}(1, M_{jk}(1), G(M_{-jk}(1))) - Y_{ijk}(1, M_{jk}(0), G(M_{-jk}(0)))]
= \sum_c \sum_m \sum_g E[Y_{ijk} \mid T_k = 1, M_{jk} = m, G(M_{-jk}) = g, C_{ijk} = c]
\times \{P(M_{jk} = m, G(M_{-jk}) = g \mid T_k = 1, C_{ijk} = c)
- P(M_{jk} = m, G(M_{-jk}) = g \mid T_k = 0, C_{ijk} = c)\}
\times P(C_{ijk} = c). \tag{11}$$


We also note, employing an idea in Peterson, Sinisi, and van 
der Laan (2006), that to identify the natural direct effect in (10), 
assumption (8) could be further weakened to require only that 
$\{Y_{ijk}(1, m, g) - Y_{ijk}(0, m, g)\} \perp\!\!\!\perp \{M_{jk}(0), G(M_{-jk}(0))\} \mid C_{ijk}$ for
all m, g. 
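
To make the plug-in structure of (10) and (11) concrete, here is a minimal Python sketch for discrete M, G, and C. The function name, the dictionary layout, and every number are assumptions made for illustration; they are not estimates from the 4Rs data, and in practice the conditional means would come from a fitted regression model.

```python
def natural_effects(mean_y, p_mg, p_c):
    """Plug-in versions of (10) and (11) for discrete M, G, and C.

    mean_y[(t, m, g, c)] : E[Y | T=t, M=m, G=g, C=c]
    p_mg[(t, c)][(m, g)] : P(M=m, G=g | T=t, C=c)
    p_c[c]               : P(C=c)
    """
    nde = nie = 0.0
    for c, pc in p_c.items():
        for (m, g), p0 in p_mg[(0, c)].items():
            p1 = p_mg[(1, c)][(m, g)]
            # (10): contrast in treatment, mediators at their T=0 distribution.
            nde += (mean_y[(1, m, g, c)] - mean_y[(0, m, g, c)]) * p0 * pc
            # (11): treatment fixed at 1, contrast in the mediator distribution.
            nie += mean_y[(1, m, g, c)] * (p1 - p0) * pc
    return nde, nie


# Invented inputs: one covariate stratum, additive outcome means.
mean_y = {(t, m, g, 0): 0.60 - 0.05 * t - 0.08 * m - 0.04 * g
          for t in (0, 1) for m in (0, 1) for g in (0, 1)}
p_mg = {
    (0, 0): {(0, 0): 0.4, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.1},
    (1, 0): {(0, 0): 0.2, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.4},
}
p_c = {0: 1.0}

nde, nie = natural_effects(mean_y, p_mg, p_c)
print(round(nde, 3), round(nie, 3))
```

The two quantities sum to the total effect, since the term involving treatment 1 with the mediator distribution under treatment 0 cancels.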


3.3 Within-Classroom and Spillover Mediated Effects 


As noted above, the natural indirect effect can be further 
decomposed into a spillover mediated effect and a within- 
classroom mediated effect. Our next theorem gives an identifi- 
cation result for these spillover and within-classroom mediated 
effects. In addition to assumptions (2) and (8), the result will 
require one further assumption to identify these effects. 


Theorem 3. If treatment $T_k$ is randomized then (1) and (7)
will hold. If, in addition, (2) and (8) hold and we furthermore
have that for $t' \neq t^*$,


$$M_{jk}(t') \perp\!\!\!\perp G(M_{-jk}(t^*)) \mid C_{ijk} \tag{12}$$


then the average counterfactual outcome $Y_{ijk}(t, M_{jk}(t'), G(M_{-jk}(t^*)))$
conditional on $C_{ijk}$ is identified and is given by


$$E[Y_{ijk}(t, M_{jk}(t'), G(M_{-jk}(t^*))) \mid C_{ijk} = c]
= \sum_m \sum_g E[Y_{ijk} \mid T_k = t, M_{jk} = m, G(M_{-jk}) = g, C_{ijk} = c]
\times P(M_{jk} = m \mid T_k = t', C_{ijk} = c)
\times P(G(M_{-jk}) = g \mid T_k = t^*, C_{ijk} = c). \tag{13}$$


Theorem 3 still applies if the independence assumptions (1), 
(2), and (8) hold only in mean rather than in distribution; inde- 
pendence assumptions (7) and (12), however, must in general 
hold in distribution. 

The interpretation of assumption (12) is that conditional on
a child’s baseline covariates, $X_{ijk}$, certain characteristics of the
school and teachers, $W_{jk}$, $V_k$, $h_2(W_{-jk})$, and some measure of
the baseline covariates of other children, $h_1(X_{-ijk})$, information
on the classroom quality of other classrooms had there been no
intervention, $G(M_{-jk}(0))$, gives no information on the quality
the child would have received under the intervention, $M_{jk}(1)$.
This is a fairly strong assumption and it may not hold in many 
settings. Using twin network diagrams (Pearl 2009), it can be 
shown that this assumption would be violated if an intervention 
to change classroom quality in one class would also have an 
effect on classroom quality in other classrooms. 

Essentially, this would only hold if there is no spillover ef- 
fect of an intervention on the mediator to the mediator value of 
other classrooms. Note that this would still be compatible with 
there being a spillover effect of the mediator in one class on the 
outcomes of children in another class. The former spillover con- 
cerns spillover of an intervention on the mediator; the latter con- 
cerns spillover of the mediator on the outcome. In the substantive 
context of the 4Rs study, the assumption that there is no spillover 
effect of an intervention on the mediator (classroom quality) in 
one classroom to the mediator values of other classrooms would 
effectively require that the teachers from different classrooms
in the same school do not interact with one another by way of
sharing information on teaching techniques, classroom manage- 
ment, and so forth. In this educational context, this assumption 
will likely not be plausible. A context in which the assumption 
may be more plausible would be if the randomized treatment 
were a community-building grant administered at a district level,
the mediator were new youth clubs at the neighborhood level, 
and the outcome were child delinquency behaviors. An interven- 
tion creating a youth club in one neighborhood arguably would 
not change the creation of a youth club in another neighbor- 
hood (i.e., no spillover of an intervention on the mediator to the 
mediators of other neighborhoods) even though there may well 
still be spillover effects of the mediator of one neighborhood 
on the outcomes of individual children in other neighborhoods 
because children often form peer groups across neighborhood 
boundaries. 

If the assumptions of Theorem 3 are satisfied then it may 
be applied to obtain empirical expressions for spillover me- 
diated effects and within-classroom mediated effects. Under 
randomization and assumptions (2), (8), and (12), the within- 
classroom mediated effect described in the previous section is 
given by 


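$$E[Y_{ijk}(1, M_{jk}(1), G(M_{-jk}(0))) - Y_{ijk}(1, M_{jk}(0), G(M_{-jk}(0)))]
= \sum_c \sum_m \sum_g E[Y_{ijk} \mid T_k = 1, M_{jk} = m, G(M_{-jk}) = g, C_{ijk} = c]
\times \{P(M_{jk} = m \mid T_k = 1, C_{ijk} = c) - P(M_{jk} = m \mid T_k = 0, C_{ijk} = c)\}
\times P(G(M_{-jk}) = g \mid T_k = 0, C_{ijk} = c) P(C_{ijk} = c) \tag{14}$$


and the spillover mediated effect described in the previous section
is given by


$$E[Y_{ijk}(1, M_{jk}(1), G(M_{-jk}(1))) - Y_{ijk}(1, M_{jk}(1), G(M_{-jk}(0)))]
= \sum_c \sum_m \sum_g E[Y_{ijk} \mid T_k = 1, M_{jk} = m, G(M_{-jk}) = g, C_{ijk} = c]
\times P(M_{jk} = m \mid T_k = 1, C_{ijk} = c)
\times \{P(G(M_{-jk}) = g \mid T_k = 1, C_{ijk} = c) - P(G(M_{-jk}) = g \mid T_k = 0, C_{ijk} = c)\} P(C_{ijk} = c). \tag{15}$$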
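
As a worked check on how (14) and (15) fit together, the two right-hand sides can be added term by term. The shorthand below is introduced only for this note and is not the paper's notation: $\mu_c(m, g)$ for $E[Y_{ijk} \mid T_k = 1, M_{jk} = m, G(M_{-jk}) = g, C_{ijk} = c]$, $p_t(m \mid c)$ for $P(M_{jk} = m \mid T_k = t, C_{ijk} = c)$, and $q_t(g \mid c)$ for $P(G(M_{-jk}) = g \mid T_k = t, C_{ijk} = c)$.

```latex
\begin{align*}
(14) + (15)
&= \sum_c \sum_m \sum_g \mu_c(m, g)\,\bigl\{[p_1(m \mid c) - p_0(m \mid c)]\,q_0(g \mid c)
   + p_1(m \mid c)\,[q_1(g \mid c) - q_0(g \mid c)]\bigr\}\,P(C_{ijk} = c) \\
&= \sum_c \sum_m \sum_g \mu_c(m, g)\,\bigl\{p_1(m \mid c)\,q_1(g \mid c) - p_0(m \mid c)\,q_0(g \mid c)\bigr\}\,P(C_{ijk} = c).
\end{align*}
```

The cross term $p_1(m \mid c)\,q_0(g \mid c)$ cancels. The result coincides with the right-hand side of (11) whenever the joint mediator distribution appearing in (11) factorizes as $p_t(m \mid c)\,q_t(g \mid c)$ for $t = 0, 1$.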


3.4 Consequences of Ignoring Interference 


Our final results are concerned with the consequences of 
ignoring interference and the conditions under which inter- 
ference can be ignored while still obtaining direct and indi- 
rect effect estimates with a meaningful causal interpretation. 
When interference is absent so that $Y_{ijk}(t, m, g) = Y_{ijk}(t, m)$
for all $t$, $m$, $g$ then the natural indirect effect would simplify
to $E[Y_{ijk}(t, M_{jk}(t))] - E[Y_{ijk}(t, M_{jk}(t^*))]$ which, under assumptions
(1), (2), (7), (8) with $Y_{ijk}(t, m, g) = Y_{ijk}(t, m)$, one could estimate by


$$\sum_c \sum_m E[Y_{ijk} \mid T_k = t, M_{jk} = m, C_{ijk} = c]
\times P(M_{jk} = m \mid T_k = t, C_{ijk} = c) P(C_{ijk} = c)
- \sum_c \sum_m E[Y_{ijk} \mid T_k = t, M_{jk} = m, C_{ijk} = c]
\times P(M_{jk} = m \mid T_k = t^*, C_{ijk} = c) P(C_{ijk} = c). \tag{16}$$


Likewise, without interference the natural direct effect reduces
to $E[Y_{ijk}(t, M_{jk}(t^*))] - E[Y_{ijk}(t^*, M_{jk}(t^*))]$ which, under
assumptions (1), (2), (7), (8) with $Y_{ijk}(t, m, g) = Y_{ijk}(t, m)$, one
could estimate by


$$\sum_c \sum_m E[Y_{ijk} \mid T_k = t, M_{jk} = m, C_{ijk} = c]
\times P(M_{jk} = m \mid T_k = t^*, C_{ijk} = c) P(C_{ijk} = c)
- \sum_c \sum_m E[Y_{ijk} \mid T_k = t^*, M_{jk} = m, C_{ijk} = c]
\times P(M_{jk} = m \mid T_k = t^*, C_{ijk} = c) P(C_{ijk} = c). \tag{17}$$


Note, we have in general by definition that
$E[Y_{ijk}(t, M_{jk}(t^*))] = E[Y_{ijk}(t, M_{jk}(t^*), G(M_{-jk}(t)))]$. However, with interference
present, the empirical quantity used in expressions (16) and
(17) to estimate $E[Y_{ijk}(t, M_{jk}(t^*))]$, namely,


$$\sum_c \sum_m E[Y_{ijk} \mid T_k = t, M_{jk} = m, C_{ijk} = c]
\times P(M_{jk} = m \mid T_k = t^*, C_{ijk} = c) P(C_{ijk} = c)$$

will no longer necessarily equal
$E[Y_{ijk}(t, M_{jk}(t^*))] = E[Y_{ijk}(t, M_{jk}(t^*), G(M_{-jk}(t)))]$ under assumptions (1), (2), (7),
and (8) alone. Theorem 4 gives assumptions under which the
equality will hold in the presence of interference. Theorem 4
will allow us to consider what the interpretation of the quantities
in (16) and (17) in fact is when interference is present but
is not taken into account.


Theorem 4. Suppose that (1), (2), (7), (8), and (12) hold
(i.e., the conditions used to identify within-classroom mediated
effects and spillover mediated effects) and suppose moreover
that (12) holds not just for $t' \neq t^*$ but also for $t' = t^*$, that is, we
also require for all $t$,


$$M_{jk}(t) \perp\!\!\!\perp G(M_{-jk}(t)) \mid C_{ijk} \tag{18}$$

then


$$\sum_c \sum_m E[Y_{ijk} \mid T_k = t, M_{jk} = m, C_{ijk} = c]
\times P(M_{jk} = m \mid T_k = t^*, C_{ijk} = c) P(C_{ijk} = c)
= E[Y_{ijk}(t, M_{jk}(t^*), G(M_{-jk}(t)))].$$


It follows from Theorem 4 that if we attempted to estimate
the natural indirect effect, ignoring interference, using (16) we
would in fact obtain


$$E[Y_{ijk}(t, M_{jk}(t), G(M_{-jk}(t)))]
- E[Y_{ijk}(t, M_{jk}(t^*), G(M_{-jk}(t)))],$$

that is, a within-classroom mediated effect. It also follows immediately
from Theorem 4 that if we attempted to estimate the
natural direct effect, ignoring interference, using (17) we would
in fact obtain


$$E[Y_{ijk}(t, M_{jk}(t^*), G(M_{-jk}(t)))]
- E[Y_{ijk}(t^*, M_{jk}(t^*), G(M_{-jk}(t^*)))]
= E[Y_{ijk}(t, M_{jk}(t^*), G(M_{-jk}(t)))]
- E[Y_{ijk}(t, M_{jk}(t^*), G(M_{-jk}(t^*)))]
+ \{E[Y_{ijk}(t, M_{jk}(t^*), G(M_{-jk}(t^*)))]
- E[Y_{ijk}(t^*, M_{jk}(t^*), G(M_{-jk}(t^*)))]\},$$

that is, the sum of a spillover mediated effect and the actual
natural direct effect.

Intuitively, this result concerning the natural direct effect is 
not so surprising. If part of the effect of treatment on the outcome 
is mediated by the classroom quality in other classrooms and 
we ignore this and consider only the effect mediated through a 
child’s own classroom then the direct effect will capture both 
the true natural direct effect,
$E[Y_{ijk}(t, M_{jk}(t^*), G(M_{-jk}(t^*)))] - E[Y_{ijk}(t^*, M_{jk}(t^*), G(M_{-jk}(t^*)))]$, and also the spillover mediated
effect since the mediator being considered in the analysis
ignoring interference is just the quality of the child’s own class- 
room. 
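
The bookkeeping behind this result can be illustrated numerically. The sketch below is a simplified setting assumed for illustration only: no covariates, binary M and G, M and G independent given treatment (as in (12) and (18)), and invented mean potential outcomes `mu[(t, m, g)]`. It evaluates the formulas that ignore G and compares them with the counterfactual contrasts named in the text, taking t = 1 and t* = 0.

```python
def cf_mean(mu, p_m, p_g, t, t_m, t_g):
    """E[Y(t, M(t_m), G(t_g))] when M and G are independent (no covariates)."""
    return sum(mu[(t, m, g)] * p_m[t_m][m] * p_g[t_g][g]
               for m in (0, 1) for g in (0, 1))


def naive_mean(mu, p_m, p_g, t, t_m):
    """Mediation formula ignoring G: sum_m E[Y | T=t, M=m] P(M=m | T=t_m)."""
    total = 0.0
    for m in (0, 1):
        # With M and G independent given T, E[Y | T=t, M=m] averages over G under T=t.
        e_y_tm = sum(mu[(t, m, g)] * p_g[t][g] for g in (0, 1))
        total += e_y_tm * p_m[t_m][m]
    return total


# Invented values (not 4Rs estimates).
mu = {(t, m, g): 0.60 - 0.05 * t - 0.08 * m - 0.04 * g - 0.02 * t * m
      for t in (0, 1) for m in (0, 1) for g in (0, 1)}
p_m = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.4, 1: 0.6}}  # P(M = m | T = t)
p_g = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.4, 1: 0.6}}  # P(G = g | T = t)

# Quantities computed as in (16) and (17), ignoring interference.
naive_nie = naive_mean(mu, p_m, p_g, 1, 1) - naive_mean(mu, p_m, p_g, 1, 0)
naive_nde = naive_mean(mu, p_m, p_g, 1, 0) - naive_mean(mu, p_m, p_g, 0, 0)

# Counterfactual contrasts that account for the other classrooms.
within = cf_mean(mu, p_m, p_g, 1, 1, 1) - cf_mean(mu, p_m, p_g, 1, 0, 1)
spillover = cf_mean(mu, p_m, p_g, 1, 0, 1) - cf_mean(mu, p_m, p_g, 1, 0, 0)
true_nde = cf_mean(mu, p_m, p_g, 1, 0, 0) - cf_mean(mu, p_m, p_g, 0, 0, 0)

print(naive_nie - within, naive_nde - (spillover + true_nde))
```

Both printed differences are zero up to floating-point error: the naive indirect effect matches the within-classroom contrast, and the naive direct effect absorbs the spillover contrast.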

Several important points merit attention here. First, the as- 
sumptions under which this intuitive result holds are rather 
strong. To ignore interference in the way described above and 
still obtain meaningful causal interpretations of the estimators in 
(16) and (17), one makes all of the assumptions used to identify 
the within-classroom mediated effect and the spillover medi- 
ated effect but moreover one also assumes (18). Second, even if 
these assumptions hold, if the substantive question of interest is 
whether classroom quality mediates the effect of treatment and 
one uses the estimator in (16), this will essentially be an under- 
estimate of the actual importance of classroom quality since it 
will not include an assessment of the effect mediated through the 
quality of other classrooms (which will instead be captured by 
the estimate of the natural direct effect ignoring interference). 
Third, if these assumptions, (1), (2), (7), (8), (12), and (18), do
not hold then it is not clear that what is being estimated as a
natural direct and indirect effect ignoring interference has any
meaningful causal interpretation at all.

We have discussed the interpretation of assumptions (1), 
(2), (7), (8), and (12) above; we now consider assumption 
(18). Assumption (18) states that conditional on a child’s
baseline covariates, $X_{ijk}$, certain characteristics of the school
and teachers, $W_{jk}$, $V_k$, $h_2(W_{-jk})$, and perhaps some measure
of the baseline covariates of other children, $h_1(X_{-ijk})$, information
on the classroom quality of the other classrooms under
treatment $t$, that is $G(M_{-jk}(t))$, gives no information on
the classroom quality of the child’s own classroom also under
treatment $t$, $M_{jk}(t)$, beyond that of the baseline covariates,
$C_{ijk} = (X_{ijk}, W_{jk}, V_k, h_1(X_{-ijk}), h_2(W_{-jk}))$. This assumption
could in some sense be interpreted as one of an absence of
resource constraints and lack of communication among teachers
within schools. A similar result to Theorem 4 also holds for
controlled direct effects which we state as our next theorem. 


Theorem 5. Suppose (1), (2), (7), (8), and (18) hold, then
the estimator for the controlled direct effect that would be used
ignoring interference, namely,


$$E[Y_{ijk} \mid T_k = t, M_{jk} = m, C_{ijk} = c]
- E[Y_{ijk} \mid T_k = t^*, M_{jk} = m, C_{ijk} = c]$$


in actual fact identifies

$$E[Y_{ijk}(t, m, G(M_{-jk}(t))) \mid C_{ijk} = c]
- E[Y_{ijk}(t^*, m, G(M_{-jk}(t^*))) \mid C_{ijk} = c].$$


Theorem 5’s interpretation is essentially that under assumptions
(1), (2), (7), (8), and (18), if we attempt to proceed with
estimating controlled direct effects ignoring interference then
what we think we are estimating as the direct effect not through 
the mediator will in fact also pick up the effect that is mediated 
through classroom quality in classrooms other than the child’s 
own. 
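
The following is an informal outline of why one of the two terms in Theorem 5 takes this form. It assumes, as in the earlier identification of controlled direct effects, that (1) and (2) equate $E[Y_{ijk}(t, m, g) \mid C_{ijk} = c]$ with the observed conditional mean; it is a sketch under that reading, not the authors' proof.

```latex
\begin{align*}
E[Y_{ijk} \mid T_k = t, M_{jk} = m, C_{ijk} = c]
&= \sum_g E[Y_{ijk} \mid T_k = t, M_{jk} = m, G(M_{-jk}) = g, C_{ijk} = c] \\
&\qquad \times P(G(M_{-jk}) = g \mid T_k = t, M_{jk} = m, C_{ijk} = c) \\
&= \sum_g E[Y_{ijk}(t, m, g) \mid C_{ijk} = c]\,
   P(G(M_{-jk}(t)) = g \mid M_{jk}(t) = m, C_{ijk} = c)
   && \text{by (1), (2), (7)} \\
&= \sum_g E[Y_{ijk}(t, m, g) \mid C_{ijk} = c]\,
   P(G(M_{-jk}(t)) = g \mid C_{ijk} = c)
   && \text{by (18)} \\
&= E[Y_{ijk}(t, m, G(M_{-jk}(t))) \mid C_{ijk} = c]
   && \text{by (8)}.
\end{align*}
```

Applying the same steps with $t^*$ in place of $t$ and subtracting gives the contrast in the theorem, in which the quality of other classrooms is left at the level it would naturally take under each treatment rather than being held fixed.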

As already noted, the assumptions employed in Theorems 4 
and 5, which allow one to ignore interference, are even stronger 
than the assumptions used to identify the effects allowing for 
interference. We finally also note that if interference is ignored 
for the treatment then the covariate aggregates for other children 
and classrooms, namely $h_1(X_{-ijk})$ and $h_2(W_{-jk})$, will likely also be
ignored, in which case (1), (2), (7), (8), (12), and (18) would all
have to hold without conditioning on $h_1(X_{-ijk})$ and $h_2(W_{-jk})$.

In summary, (1) and (7) will hold under randomization. Under 
(1) and (2) we can identify the controlled direct effects of the 
treatment, classroom quality, and quality of other classrooms. 
Under (1), (2), (7), and (8), we can identify natural direct and 
indirect effects. Under (1), (2), (7), (8), and (12), we can fur- 
thermore identify the within-classroom mediated effect and the 
spillover mediated effect. Under (1), (2), (7), (8), (12), and (18), 
we could ignore interference and yet still identify the within- 
classroom mediated effect and also the sum of the natural direct 
effect and the spillover-mediated effect. 


4. ANALYSIS OF THE 4Rs EDUCATIONAL 
INTERVENTION 


4.1 Sample and Data 


We apply the theoretical results in the previous section to 
a study of the 4Rs educational intervention. The study (Jones, 
Brown, and Aber 2011) involved a 3-year, 6-wave longitudinal 
experimental design with measurements in the fall and spring 
semester of each year. The 18 New York City elementary schools 
in the study were fairly representative of the demographic com- 
position of New York City schools generally. The schools were 
pair matched at baseline to minimize within-pair multivariate 
distance based on 20 school characteristics including size, av- 
erage reading achievement, race/ethnic composition, mobility/ 
2-year stability, school lunch receipt, expenditures, attendance, 
and organizational readiness. Within each pair, schools were 
randomly assigned to either the 4Rs treatment or the control 
group. The intervention is implemented school-wide, grades 
K-6 for 3 years; all third grade children in each school were 
followed over 3 years through fifth grade. In the application 
here, we will consider the first year of the study for the chil- 
dren beginning in third grade. The sample includes 82 third 
grade classrooms and 942 children. Schools in both treatment 
and control conditions had an average of three classrooms per 
grade level, ranging from smaller schools in both groups with 
between two and three classrooms in each grade level to larger 
schools such as a school with between four and nine classrooms 
in the treatment group and a school with between five and seven 
classrooms in the control group. 

Classroom quality was measured using the CLASS scoring
system (Pianta, La Paro, and Hamre 2005), which assesses
instructional support, emotional support, and organizational
climate with an overall score between 1 and 7. The internal
reliability for this scale was 0.93. The measure assesses the
quality of the interactions among students and teachers through
which students are afforded opportunities to experience positive
connections to their peers and teachers in well-regulated,
organized classroom settings. Note that this is a measure of
classroom quality and cannot, in any sense, be taken as “true”
classroom quality. Our analyses must, therefore, be interpreted
with respect to “measured classroom quality” as the mediator.
In prior research (Buchinal et al. 2010), the overall measure of
class quality showed a threshold effect on child outcomes. We
observed a similar threshold effect at a cut-off of approximately
4.4. Hence, for simplicity in this example, the quality of a
child’s classroom mediator, $M_{jk}$, was dichotomized so that
$M_{jk} = 1$ if the class measure was greater than or equal to
4.4 and $M_{jk} = 0$ otherwise. Likewise, the average quality of
classrooms other than the child’s own, $G_{jk} = G(M_{-jk})$, was
dichotomized with respect to the same threshold: we define
$G_{jk} = 1$ if the average score was greater than or equal to 4.4
and $G_{jk} = 0$ otherwise. Table 1 shows the number of classes in
each M-by-G category within each treatment group. Covariate
overlap was assessed for the covariates by strata defined by T
and by dichotomized values of M and G as described below.

Table 1. Sample of classes in M-by-G categories in each
treatment group

                  T = 1    T = 0
M = 0, G = 0        12       17
M = 0, G = 1         7        6
M = 1, G = 0         9       13
M = 1, G = 1        17        1
Total               45       37
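
The construction of the two binary mediators can be sketched in a few lines of Python. The 4.4 threshold comes from the text; the function name, the use of an unweighted leave-one-out mean of continuous scores, and the example scores are assumptions made for illustration.

```python
THRESHOLD = 4.4  # cut-off reported in the text


def dichotomize_school(scores, threshold=THRESHOLD):
    """Return one (M, G) pair per classroom in a school.

    M is 1 if the classroom's own score is >= threshold; G is 1 if the
    unweighted mean score of the other classrooms in the school is >= threshold.
    At least two classrooms are needed for G to be defined.
    """
    if len(scores) < 2:
        raise ValueError("need at least two classrooms to define G")
    total = sum(scores)
    pairs = []
    for s in scores:
        others_mean = (total - s) / (len(scores) - 1)
        pairs.append((int(s >= threshold), int(others_mean >= threshold)))
    return pairs


# Invented scores for one hypothetical school with three classrooms.
pairs = dichotomize_school([5.0, 4.0, 4.6])
print(pairs)
```

Because G is computed from the other classrooms only, two classrooms in the same school can share a value of M and still differ in G.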

The outcome was child depressive symptoms scored on 
a scale of 0 to 1. Child level covariates included child 
race/ethnicity, gender, baseline depression, and baseline liter- 
acy skills; teacher covariates included teacher race/ethnicity and 
baseline confidence in behavior management. Because random- 
ization to the 4Rs program occurred within matched pairs of 
schools, the models also included pair-fixed effects to control 
for pair-specific factors. Table 2 lists the distribution of the 
outcome and of each covariate by treatment and class quality. 
Unsurprisingly, children attending low-quality classes in control 
schools displayed the highest level of depressive symptoms at 
the end of the treatment year. 


4.2 Assumptions 


In attempting to draw causal conclusions, a critical evaluation
of the assumptions is crucial in assessing the validity of the
study results. In this subsection, we consider the assumptions 
employed in our analysis of the 4Rs data and relate the assump- 
tions considered in Section 3 to the 4Rs study. As the treatment 
is randomized, assumption (1) holds; assumption (7) would like- 
wise hold by randomization. By controlling for various child, 
teacher, and school level covariates we hope to render assump- 
tion (2) somewhat plausible so that the effects of the quality in a 
child’s own classroom and the effect of the quality of classrooms 
other than the child’s own are unconfounded. Classroom quality 
arises from a combination of the particular teacher and the par- 
ticular students in that class. Controlling for the aforementioned 
child and teacher characteristics within matched pairs of schools
removes a considerable amount of bias associated with selection
into different class quality levels and helps render assumption
(2) somewhat plausible. Conditioning on these observed covariates,
if the assignment of students to teachers within a school
is unsystematic, as indicated by evidence from field reports in
the 4Rs study, this would also render assumption (2) somewhat
more plausible. However, we are unaware of any comprehensive
study on the dynamics of the assignment process. Under
assumptions (1) and (2), we can estimate the controlled direct
effects of the 4Rs intervention, the classroom quality mediator,
and the spillover controlled direct effect of classroom quality in
other classrooms.

Table 2. Distributions of outcome and covariates by treatment and class quality

                                             T = 1 (4Rs)                 T = 0 (control)
                                          M = 1        M = 0          M = 1        M = 0
                                        Mean   SD    Mean   SD      Mean   SD    Mean   SD
Child outcome: Depressive symptoms      0.49  0.27   0.46  0.25     0.48  0.26   0.54  0.25
Child baseline depressive symptoms      0.48  0.26   0.51  0.25     0.50  0.27   0.47  0.20
Child baseline literacy skills          3.22  1.06   3.06  1.01     3.03  1.03   3.24  1.14
Child gender (0 = boy; 1 = girl)        0.54   -     0.49   -       0.53   -     0.49   -
Child Hispanic                          0.48   -     0.41   -       0.54   -     0.44   -
Child Black                             0.38   -     0.47   -       0.29   -     0.43   -
Child other non-White racial identity   0.09   -     0.08   -       0.13   -     0.07   -
Teacher baseline confidence             6.02  0.90   4.99  1.84     5.89  0.94   5.43  1.47
Teacher Hispanic                        0.19   -     0.22   -       0.00   -     0.13   -
Teacher Black                           0.19   -     0.39   -       0.21   -     0.30   -

To carry out the effect decomposition above so as to decom- 
pose a total effect into a natural direct effect, a within-classroom 
mediated effect and spillover mediated effect, we would also 
need to employ assumptions (8) and (12). Assumption (8) was 
that there were no mediator-outcome confounders affected by 
treatment; assumption (8) may be somewhat plausible insofar 
as we have used as our mediator the earliest available measure- 
ments of classroom quality after treatment was administered 
(see VanderWeele and Vansteelandt 2009), but it may be con- 
testable in this application in that class quality was measured 
in the second half, rather than the first half, of the school year. 
Assumption (12) is also problematic in the 4Rs study. Assump- 
tion (12) would require that an intervention to change classroom 
quality in one class must not affect classroom quality in other 
classrooms, that is, that there is no spillover effect of an interven- 
tion on the mediator to the mediator value of other classrooms. 
This assumption is compatible with there being a spillover effect 
of the mediator in one class on the outcomes of children in other 
classes. However, in the context of the 4Rs study, this assump- 
tion would effectively require that the teachers in each school 
do not interact with one another by way of sharing information 
on teaching techniques, classroom management, etc. Because 
teacher autonomy has been the norm rather than the exception 
in most U.S. schools (Lortie 1975), it might be argued that, at 
least for the control schools, a lack of communication among 
teachers may prevail. However, given that the 4Rs program was 
deliberately designed to enhance the professional skills of all 
teachers within a school in part through collaboration among 
teachers, spillover of the quality of one class to the quality of
other classes arguably would occur in a 4Rs school. Assumption
(12) is likely implausible in this application. As noted above, 
contexts in which assumption (12) might be plausible would 
be those in which there is no spillover of an intervention on 
one classroom mediator to the mediators of other classrooms, 
but there is still potential spillover from the mediator of one 
classroom to the outcomes of other classrooms. 

Because of the problematic nature of both assumptions (8) 
and (12), we do not in our application proceed with the es- 
timation of natural direct effects, within-classroom mediated 
effects and spillover mediated effects. We restrict the analysis 
to the estimation of the controlled direct effects of the 4Rs in- 
tervention and of the classroom quality mediator, and to the 
spillover controlled direct effect of classroom quality in other 
classrooms. Identification and estimation of these effects only 
require assumptions (1) and (2). 


4.3 Models and Estimation 


In Section 3, we described how, under certain assumptions, 
the various causal effects of interest were identified in terms 
of conditional expectations. Here we will describe a model- 
ing approach for estimating the controlled direct effects of the 
treatment, of one’s own class quality as a mediator, and of 
the quality of other classes in the same school. In the online 
supplement, we discuss a potential modeling approach for esti- 
mating natural direct and indirect effects and within-classroom 
and spillover mediated effects when these effects are identi- 
fied. Appropriate modeling will depend in part on the study 
design, on the data available, and on the causal quantities of 
interest. 

With data from this group randomized trial, we may estimate 
the total effect of the 4Rs program on child depressive symp- 
toms through analyzing a multilevel model reflecting the data 
structure in which third-graders were nested within classes that 
were in turn nested within schools: 


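$$Y_{ijk} = \alpha + \beta T_k + \gamma f(C_{ijk}) + \epsilon_k + v_{jk} + u_{ijk} \quad \text{with} \quad
\epsilon_k \sim N(0, \omega), \; v_{jk} \sim N(0, \tau), \; u_{ijk} \sim N(0, \sigma^2). \tag{19}$$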


The model includes a school-specific random component $\epsilon_k$,
a class-specific random component $v_{jk}$, and a student-specific
random component $u_{ijk}$. These are assumed normally distributed
and mutually independent. Because assumption (1) holds under
randomization, treatment assignment is independent of these
random effects. In model (19), $f(C_{ijk})$ is a function of covariates
including child gender, race, baseline depressive symptoms and 
baseline literacy skills, teacher race, and baseline confidence in 
class management, along with fixed effects for matched school 
pairs. All covariates were centered at their sample means. To 
compare the 4Rs school and the control school within matched 
pairs, one might have alternatively included in the multilevel 
model random effects for matched pairs and then centered all 
the predictors at their respective means within matched pairs. 
This would be equivalent to our model using pair fixed effects 
(Raudenbush 2009). 

Preliminary examination of the data revealed nonlinear and 
nonadditive associations between treatment and teacher baseline 
confidence in class management in predicting child depressive 
symptoms. Teacher baseline confidence was measured on a 1-7 
scale. We identified a cutoff at 5 and specified different slopes 
below and above the cutoff under each treatment. This was the 
only covariate for which there was clear evidence of different 
slopes (details available from the authors upon request). Sub- 
stantive results were, however, generally similar even when not 
allowing for differing slopes for high and low teacher baseline 
confidence. Parameter values and model-based standard errors 
were estimated via maximum likelihood in HLM 7.0. The sam- 
ple size was insufficient to consider robust standard errors. 

Likewise, to estimate the effect of the 4Rs program on class 
quality measured on a continuous scale, we specified a two-level 
model with classes nested within schools: 


$$M_{jk} = \mu + \kappa T_k + \lambda f(C_{jk}) + \epsilon_k + v_{jk}
\quad \text{with} \quad \epsilon_k \sim N(0, \omega), \; v_{jk} \sim N(0, \varsigma). \tag{20}$$


Here $\epsilon_k$ and $v_{jk}$ are assumed normally distributed and mutually
independent. Given the randomization of treatment within
matched pairs of schools, treatment assignment is independent
of these random effects. In model (20), $f(C_{jk})$ is a function
of covariates including the aforementioned teacher baseline co- 
variates and pair-fixed effects. 

Finally, we fit a multilevel model for the effects of treatment, 
classroom quality, and quality of other classrooms on child de- 
pressive symptoms with the interactions between these variables 
saturated: 


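$$Y_{ijk} = \alpha_{00} I(M_{jk} = 0, G_{jk} = 0) + \alpha_{01} I(M_{jk} = 0, G_{jk} = 1)
+ \alpha_{10} I(M_{jk} = 1, G_{jk} = 0) + \alpha_{11} I(M_{jk} = 1, G_{jk} = 1)
+ \beta_{00} T_k I(M_{jk} = 0, G_{jk} = 0)
+ \beta_{01} T_k I(M_{jk} = 0, G_{jk} = 1)
+ \beta_{10} T_k I(M_{jk} = 1, G_{jk} = 0)
+ \beta_{11} T_k I(M_{jk} = 1, G_{jk} = 1)
+ \gamma f(C_{ijk}) + \epsilon_k + v_{jk} + u_{ijk}
\quad \text{with} \quad \epsilon_k \sim N(0, \omega), \; v_{jk} \sim N(0, \tau), \; u_{ijk} \sim N(0, \sigma^2). \tag{21}$$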


Here $\varepsilon_k$, $v_{jk}$, and $u_{ijk}$ are assumed normally distributed
and mutually independent, and $f(C_{ijk})$ is a function of the aforementioned
child and teacher baseline covariates and pair fixed effects.
Under identification assumptions (1) and (2), the treatment
indicator, class quality indicators, and their interactions are
assumed independent of these random effects after adjustment for
$f(C_{ijk})$; $I(M_{jk} = m, G_{jk} = g)$ is an indicator function that takes
value 1 if $M_{jk} = m$ and $G_{jk} = g$ and 0 otherwise. The class quality
indicators could be associated with the random error terms if the
identification assumptions do not hold. For example, if, after
statistical adjustment for the covariates, class quality is generally
higher in schools in which students enter third grade with a
lower level of depressive symptoms on average, the effects of
class quality would be estimated with bias. This is unlikely to
be a problem here given that our covariate list includes
individual students' baseline depressive symptoms. Unlike most
conventional methods that assume no treatment-mediator
interaction, we explicitly model the direct effect of the 4Rs program
as a function of one's own class quality and of the quality of
other classes. For simplicity, the model assumes that the direct
effect does not depend on the covariate values. This assumption
could also be relaxed. We checked covariate overlap for each of
the variables in $C_{ijk}$ by strata defined by $T_k$, $M_{jk}$, $G_{jk}$ and
found reasonably good covariate overlap. Average covariate values
stratified by $T_k$, $M_{jk}$ are given in Table 2. A larger table
stratified by $T_k$, $M_{jk}$, $G_{jk}$ is available upon request from the
authors. In general, covariate means differ between groups by less
than one standard deviation.
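
To make the saturated parameterization concrete, the following minimal
sketch builds the eight fixed-effect regressors of model (21) for a single
child: four cell indicators for (m, g) and their products with the treatment
indicator. The function name and the ordering of the columns are assumptions
made for illustration; covariates and random effects are left out.

```python
def saturated_row(t, m, g):
    """Regressors of the saturated model for treatment t, own-class
    quality m, and other-class quality g (each coded 0 or 1).

    Columns: I(0,0), I(0,1), I(1,0), I(1,1), then the same four
    indicators multiplied by t.
    """
    cells = [(0, 0), (0, 1), (1, 0), (1, 1)]
    indicators = [1 if (m, g) == cell else 0 for cell in cells]
    interactions = [t * x for x in indicators]
    return indicators + interactions


# A treated child in a low-quality class surrounded by high-quality classes:
row = saturated_row(t=1, m=0, g=1)
```

Exactly one of the four cell indicators equals 1 for any child, so the four
alpha coefficients act as cell-specific intercepts and no separate intercept
appears in this sketch.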

In model (21), under assumptions (1) and (2), $\alpha_{mg}$ correspond
to mean potential outcomes $E[Y_{ijk}(0, m, g)]$; also $\alpha_{mg} + \beta_{mg}$
correspond to mean potential outcomes $E[Y_{ijk}(1, m, g)]$. The
coefficients $\beta_{mg}$ correspond to the controlled direct effects of
treatment, $E[Y_{ijk}(1, m, g) - Y_{ijk}(0, m, g)]$. The controlled
direct effects of the quality of an individual's own classroom,
$E[Y_{ijk}(t, 1, g) - Y_{ijk}(t, 0, g)]$, are $\alpha_{1g} - \alpha_{0g}$ when $t = 0$ and
$(\alpha_{1g} + \beta_{1g}) - (\alpha_{0g} + \beta_{0g})$ when $t = 1$; the controlled direct
effects of the quality of other classrooms,
$E[Y_{ijk}(t, m, 1) - Y_{ijk}(t, m, 0)]$, are $\alpha_{m1} - \alpha_{m0}$ when $t = 0$ and
$(\alpha_{m1} + \beta_{m1}) - (\alpha_{m0} + \beta_{m0})$ when $t = 1$. The parameters from
model (21), saturated for $T_k$, $M_{jk}$, $G_{jk}$, directly map onto the
controlled direct effects of interest. This mapping can be used as
a guide for researchers interested in these effects. We estimate
these parameters and effects with the 4Rs data below.
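
The mapping from the coefficients of model (21) to the controlled direct
effects can be written out as a short sketch. The function below takes the
alpha and beta coefficients keyed by (m, g) and returns the twelve contrasts
described above. The function name, the dictionary layout, and the numerical
values are hypothetical and serve only to illustrate the arithmetic.

```python
def controlled_direct_effects(alpha, beta):
    """Map saturated-model coefficients to controlled direct effects.

    alpha, beta: dicts keyed by (m, g) with m, g in {0, 1}.
    Returned keys: ("T", m, g), ("M", t, g), ("G", t, m).
    """
    def mean(t, m, g):
        # Mean potential outcome E[Y(t, m, g)] = alpha_mg + t * beta_mg
        return alpha[(m, g)] + t * beta[(m, g)]

    effects = {}
    for m in (0, 1):
        for g in (0, 1):
            effects[("T", m, g)] = mean(1, m, g) - mean(0, m, g)
    for t in (0, 1):
        for g in (0, 1):
            effects[("M", t, g)] = mean(t, 1, g) - mean(t, 0, g)
        for m in (0, 1):
            effects[("G", t, m)] = mean(t, m, 1) - mean(t, m, 0)
    return effects


# Hypothetical coefficient values, not estimates from the 4Rs data.
alpha = {(0, 0): 1.0, (0, 1): 1.2, (1, 0): 0.9, (1, 1): 1.1}
beta = {(0, 0): -0.1, (0, 1): -0.3, (1, 0): 0.0, (1, 1): -0.2}
effects = controlled_direct_effects(alpha, beta)
```

Because the effect of treatment at (m, g) reduces to beta_mg, the treatment
contrasts can be read directly off the interaction coefficients, whereas the
contrasts for M and G combine two alpha terms and, when t = 1, two beta terms.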

Note that the conventional analysis for mediation would re- 
place model (21) with the following: 


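$$
Y_{ijk} = \alpha + \beta T_k + \delta M_{jk} + \gamma f(C_{ijk}) + \varepsilon_k + v_{jk} + u_{ijk},
\quad \text{with } \varepsilon_k \sim N(0, \omega),\; v_{jk} \sim N(0, \tau),\; u_{ijk} \sim N(0, \sigma^2), \tag{22}
$$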


where $\varepsilon_k$, $v_{jk}$, and $u_{ijk}$ are again assumed normally distributed
and mutually independent, ignoring interactions and ignoring
potential interference. The conventional analysis would, moreover,
often omit adjustment for the covariates $f(C_{ijk})$.
The conventional approach would take $\beta$ in model (22) as the
direct effect and the difference between $\beta$ in model (22) and $\beta$
in model (19) as the indirect or mediated effect. As noted above,
for linear models, this “difference method” for the indirect
effect gives the same results as would a “product-of-coefficients”
method (MacKinnon 2008). For comparison, we also present
the results of this conventional analysis using model (22).
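
The agreement between the difference method and the product-of-coefficients
method can be checked numerically in a simplified setting. The sketch below
uses single-level ordinary least squares on simulated data, without
covariates or random effects, so it illustrates the algebraic identity for
linear models rather than the multilevel models fitted in this article. The
data-generating values and all names are hypothetical.

```python
import random


def cov(x, y):
    """Population covariance of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def mediation_ols(t, m, y):
    """Closed-form OLS slopes for Y ~ T, M ~ T, and Y ~ T + M."""
    s_tt, s_mm, s_tm = cov(t, t), cov(m, m), cov(t, m)
    s_ty, s_my = cov(t, y), cov(m, y)
    total = s_ty / s_tt                      # coefficient of T in Y ~ T
    a = s_tm / s_tt                          # coefficient of T in M ~ T
    det = s_tt * s_mm - s_tm ** 2
    direct = (s_ty * s_mm - s_tm * s_my) / det   # coefficient of T in Y ~ T + M
    b = (s_my * s_tt - s_tm * s_ty) / det        # coefficient of M in Y ~ T + M
    return total, direct, a, b


rng = random.Random(1)
t = [i % 2 for i in range(200)]                       # balanced treatment
m = [0.5 * ti + rng.gauss(0.0, 1.0) for ti in t]      # mediator
y = [-0.1 * ti - 0.3 * mi + rng.gauss(0.0, 1.0) for ti, mi in zip(t, m)]

total, direct, a, b = mediation_ols(t, m, y)
difference_method = total - direct
product_method = a * b
```

In this setting the two quantities coincide up to floating-point error for any
data set, because the identity is a property of least squares rather than of
the particular simulated values.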


4.4 Results 


Fitting model (19) to the data from the 4Rs study, we obtained
an estimated total treatment effect of the 4Rs intervention
on depressive symptom outcomes of −0.052 (se = 0.022, t =
−2.305, p = 0.05), which suggests a marginally significant
effect of the treatment in reducing child depressive symptoms.
Likewise, fitting model (20) to the data from the 4Rs study, we
obtained an estimated treatment effect of the 4Rs intervention
on classroom quality of 0.45 (se = 0.20, t = 2.28, p = 0.05),
indicating an improvement in classroom quality due to the 4Rs
intervention.

Table 3. Average controlled direct effects of treatment (T), quality of one’s own class (M), and quality of other classes in school (G)

Controlled direct effect of T                     Coefficient  s.e.   t       p
M = 0, G = 0    β_00                              −0.035       0.046  −0.762  0.450
M = 0, G = 1    β_01                              −0.154       0.051  −2.980  0.004
M = 1, G = 0    β_10                              −0.022       0.046  −0.470  0.640
M = 1, G = 1    β_11                              0.016        0.076  0.211   0.834

Controlled direct effect of M                     Coefficient  s.e.   χ²      p
T = 0, G = 0    α_10 − α_00                       0.019        0.047  0.168   >0.500
T = 0, G = 1    α_11 − α_01                       −0.118       0.086  1.874   0.167
T = 1, G = 0    (α_10 + β_10) − (α_00 + β_00)     0.033        0.047  0.480   >0.500
T = 1, G = 1    (α_11 + β_11) − (α_01 + β_01)     0.053        0.046  1.319   0.249

Controlled direct effect of G                     Coefficient  s.e.   χ²      p
T = 0, M = 0    α_01 − α_00                       0.085        0.056  2.345   0.122
T = 0, M = 1    α_11 − α_10                       −0.052       0.083  0.392   >0.500
T = 1, M = 0    (α_01 + β_01) − (α_00 + β_00)     −0.034       0.057  0.359   >0.500
T = 1, M = 1    (α_11 + β_11) − (α_10 + β_10)     −0.014       0.054  0.069   >0.500

Fitting model (21), we obtained estimates of the average
controlled direct effects of T, M, and G; these results are
summarized in Table 3. We applied t-tests to the controlled direct
effects of T and chi-square tests with 1 degree of freedom each
to the controlled direct effects of M and G. Among the four
possible combinations of m and g, the 4Rs program showed a
statistically significant direct effect on child depressive
symptoms only when a child attended a low-quality class surrounded
by high-quality classes (i.e., $\beta_{01}$). None of the controlled direct
effects of M and of G appears to be different from zero; however,
our sample size may be too small to detect such effects.
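
As a small consistency check on Table 3, the contrasts for M and G at T = 1
can be rebuilt from the contrasts at T = 0 and the four estimated beta
coefficients, since each T = 1 contrast equals the corresponding T = 0
contrast plus a difference of two betas. The sketch below uses only the
rounded values printed in Table 3; the variable names and the rounding
tolerance are our own choices for illustration.

```python
# Estimated controlled direct effects of T from Table 3, keyed by (m, g).
beta = {(0, 0): -0.035, (0, 1): -0.154, (1, 0): -0.022, (1, 1): 0.016}

# Contrasts reported at T = 0.
m_effect_t0 = {0: 0.019, 1: -0.118}   # keyed by g: alpha_1g - alpha_0g
g_effect_t0 = {0: 0.085, 1: -0.052}   # keyed by m: alpha_m1 - alpha_m0

# Contrasts reported at T = 1.
m_effect_t1_reported = {0: 0.033, 1: 0.053}    # keyed by g
g_effect_t1_reported = {0: -0.034, 1: -0.014}  # keyed by m

# Rebuild the T = 1 contrasts from the T = 0 contrasts and the betas.
m_effect_t1 = {g: m_effect_t0[g] + beta[(1, g)] - beta[(0, g)] for g in (0, 1)}
g_effect_t1 = {m: g_effect_t0[m] + beta[(m, 1)] - beta[(m, 0)] for m in (0, 1)}

# Inputs are rounded to three decimals, so allow a small tolerance.
TOLERANCE = 0.002
m_gaps = {g: abs(m_effect_t1[g] - m_effect_t1_reported[g]) for g in (0, 1)}
g_gaps = {m: abs(g_effect_t1[m] - g_effect_t1_reported[m]) for m in (0, 1)}
consistent = all(gap <= TOLERANCE for gap in list(m_gaps.values()) + list(g_gaps.values()))
```

The rebuilt values agree with the reported ones to within rounding, which is
what the saturated parameterization implies.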

We tested the null hypothesis of no interaction of treatment
with either a child’s own classroom quality or the quality
of other classrooms, that is, $\beta_{00} = \beta_{01} = \beta_{10} = \beta_{11}$. The
chi-square test result indicated that we could reject the null
hypothesis ($\chi^2 = 10.28$, df = 4, p = 0.035). Furthermore,
according to the result of a likelihood ratio test, a parsimonious
model constraining $\beta_{00} = \beta_{10} = \beta_{11}$ appeared to be
equivalent to model (21). The parsimonious model gave an estimate
of the average controlled direct effect of treatment of −0.022
(se = 0.029, t = −0.743, p = 0.461) for children whose own
class quality and the quality of other classes in school were
both low or both high, or whose own class quality was high
while the quality of other classes was low. The estimated
average controlled direct effect of treatment for children whose
own class quality was low while the quality of other classes
in school was high remained statistically significant at −0.154
(se = 0.051, t = −3.006, p = 0.004). For these students,
without changing class quality, simply being under the 4Rs program
seemed to have the potential of reducing child depressive
symptoms by about 60% of a standard deviation. The 95% confidence
interval for the effect size ranged from about 22% to 98% of a
standard deviation.
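
The p-value for the chi-square test of the interaction hypothesis can be
recomputed approximately from the reported statistic and degrees of freedom.
The sketch below uses the closed-form upper tail probability of the
chi-square distribution for an even number of degrees of freedom; the
function name is hypothetical, and the inputs are the rounded values reported
above.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for X ~ chi-square with an even number of degrees of freedom.

    For df = 2k the tail is exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


# Reported test of no treatment-by-mediator interaction.
p_value = chi2_upper_tail_even_df(10.28, 4)
```

The result is about 0.036, in line with the reported p = 0.035 given that the
test statistic is itself rounded to two decimals.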


We now consider the results of fitting model (22) to the 4Rs
data using the conventional approach to mediation analysis,
ignoring potential interference and potential treatment-mediator
interaction. With the conventional analysis, fitting model (22),
the direct effect of the treatment would have appeared to
be −0.051 (se = 0.023, t = −2.233, p = 0.056), suggesting a
marginally significant beneficial effect for reducing child
depressive symptoms regardless of class quality. This “direct
effect,” −0.051, is very similar to the estimate of the total effect
reported above, −0.052, and so the conclusion of the conventional
approach would be that almost none of the effect of the
4Rs intervention on depressive symptoms is mediated by
classroom quality. Note, however, that even under the best-case
scenario in which all of the assumptions of Theorem 5 hold, this
“direct effect” under the conventional approach would still
capture part of the effect of the 4Rs intervention that is mediated
through classroom quality in classrooms other than the child’s own;
in the presence of spillover, without the assumptions of Theorem
5, the estimate under the conventional approach does not have
a clear causal interpretation. The conventional approach misses
the potential interaction and spillover that were suggested by our
analyses above.


5. CONCLUDING REMARKS 


In this article we have developed an approach for analyzing 
spillover effects in mediation analysis with data from group 
randomized trials. By relaxing the stable unit treatment value 
assumption (SUTVA) typical in causal inference, we have been 
able to precisely define spillover effects that will often be of 
substantive and theoretical interest. The 4Rs program evaluation 
presented in this article has provided an important case study 
in which interference is not simply a problem that must be 
addressed but in fact may be an important pathway mediating the 
treatment effect on individual outcomes. By taking into account 
potential interference among students from different classrooms 
in the same school, we were able to reveal that, for children 
attending low-quality classes in schools in which other classes 
have relatively high quality, the 4Rs intervention could possibly 
reduce child depressive symptoms through means other than 
improving class quality. 

We have presented theoretical results for identifying the
controlled direct effects of the treatment, of a student’s own class
quality as a mediator, and of the quality of other classes in the
same school as a concurrent mediator. In addition, the article
has provided assumptions for identifying natural direct and
indirect effects, within-classroom mediated effects, and spillover
mediated effects. The identification assumptions for these latter
effects are particularly strong and the estimation of these effects
was thus not pursued in our analysis of the 4Rs data. We hope
that by presenting the assumptions, researchers will be more
aware of what is required in the design and analysis of studies
that seek to estimate these various effects. For example,
assumption (8) might have been more plausible had the mediator
been measured much earlier in the year.

However, even for the controlled direct effects of the
treatment, the mediator, and the value of the mediator in other
classrooms, the approach that we have described here constitutes an
important advance over the conventional approach to analyzing
direct and indirect effects that is often used in group-randomized
trials. This is because our approach (i) allows an individual’s
potential outcomes to depend on one’s own mediator and on other
individuals’ mediators as well as on the treatment (i.e., allows
for interference), (ii) accommodates possible treatment-by-mediator
interactions, (iii) disentangles the causal effects of interest using
a model that is saturated for the treatment, the mediator, and the
mediator of other classrooms, and (iv) explicates the
no-unmeasured-confounding assumptions required for identification.
We have also documented the consequences of ignoring
interference when it is present. We saw that even in a best-case
scenario under which our various identification assumptions
hold, estimates of direct effects that ignore interference
also capture effects through the mediator values in classrooms
other than a child’s own classroom.

Our analysis of the 4Rs data made some strong assumptions
and is therefore subject to some important limitations. First, the
no-unmeasured-confounding assumptions required for
identification are quite strong; we have tried to make them more
plausible by conditioning on relevant pretreatment covariates.
Further research could also develop sensitivity analysis
techniques (VanderWeele 2010b; Imai et al. 2010) to assess the
extent to which an unobserved variable affecting both the
mediator and the outcome might invalidate inference about direct,
indirect, and spillover effects. Such unobserved variables might
give rise to confounding of the effects related to both the quality
of a child’s own classroom and the quality of other
classrooms. Second, we have made the simplifying assumption in
the current application that spillover effects depend on a scalar
mean function of the quality of other classrooms; this assumption
is not necessary in principle. However, without a very large
dataset it would be difficult to make progress without some
assumption on the form of interference. Future work could consider
using friendship network data to make alternative assumptions on
the form of interference. Third, we have used a single measurement
of classroom quality, so our analyses need to be interpreted with
respect to measured classroom quality as the mediator; future work
could consider the consequences of measurement error.
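
To illustrate what a scalar mean function of other classrooms’ quality can
look like, the sketch below computes, for each class in a school, the mean
quality of the remaining classes and then dichotomizes it. This is one
possible construction written for illustration; the function names, the
example scores, and the cutoff are hypothetical and are not taken from the
4Rs analysis.

```python
def other_class_quality(quality_by_class):
    """Map each class to the mean quality of the other classes in its school."""
    n = len(quality_by_class)
    if n < 2:
        raise ValueError("need at least two classes in the school")
    total = sum(quality_by_class.values())
    return {c: (total - q) / (n - 1) for c, q in quality_by_class.items()}


def dichotomize(value, cutoff):
    """Return 1 for high quality (value at or above cutoff), else 0."""
    return 1 if value >= cutoff else 0


# Hypothetical quality scores for three classes in one school.
school = {"class_a": 4.0, "class_b": 5.0, "class_c": 6.0}
leave_one_out = other_class_quality(school)
g_indicator = {c: dichotomize(v, cutoff=5.0) for c, v in leave_one_out.items()}
```

Under a construction of this kind, two classes in the same school can have
different values of G, because each class is excluded from its own average.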

The analyses here have important implications for discerning
in what contexts the 4Rs intervention is most effective. We saw a
relatively large controlled direct effect of the 4Rs program when
the child’s own classroom quality is low in a school in which the
overall classroom quality of other classrooms is relatively high.
This direct effect of the treatment may be mediated by other
variables, possibly related to conflict resolution or reduction in
bullying among children in the hallways or on the playground,
raising new questions for further investigation. The analysis and
results here are also important in that they provide a template
and framework for other such analyses in education research
and the social sciences more broadly. In group-randomized
trials, interference and spillover effects are often important to
consider for understanding the causal mechanisms of a group-level
intervention.


SUPPLEMENTARY MATERIALS 
Proofs for the theorems presented in the article. 


[Received December 2010. Revised October 2012.] 